
Brushing up on Computer Science Part 1, Big O - messel
http://www.victusspiritus.com/2011/03/14/brushing-up-on-computer-science-part-1-big-o/
======
Homunculiheaded
Personally, I think everyone who writes code for fun or profit should have a
copy of CLRS[0] nearby. The first few chapters are enough to gain an
excellent understanding of Big O. I would not recommend reading the book
cover to cover, but each chapter is fairly self-contained, and once you get
used to the book's style it makes for great weekend reading.

The thing with algorithms is that you don't always see how they're useful if
you aren't familiar with them. Being able to quickly identify a hard problem
as hard, and to have a solution already available, is very valuable (even if
you have to research the solution). If you code for fun, these are the most
fun problems to think about. If you code for profit, even if you almost never
need to use sophisticated algorithms in real software, making use of them
even just once should give you a pretty good advantage over your competitors
(since, truth be told, outside of Google and the like most software
developers do not have a copy of CLRS on their desk).

[0]
[http://mitpress.mit.edu/catalog/item/default.asp?ttype=2&...](http://mitpress.mit.edu/catalog/item/default.asp?ttype=2&tid=11866)

~~~
ominous_prime
That's one of the few hard-copy books I keep around (I don't like wasting dead
trees for something that will be outdated in a few years). I don't think it
makes for a good self-teaching book for many people, because it's very long
and dry, though it's still a great reference.

------
shortlived
For those looking to actually practice calculating Big O for functions, here
are some useful problem sets, with answers:

<http://www.cs.wustl.edu/~sg/CSE241_SP08/hw.html>

Problem set for Big O and recursion:

[http://www-cs-faculty.stanford.edu/~eroberts/courses/cs106b/...](http://www-cs-faculty.stanford.edu/~eroberts/courses/cs106b/handouts/23-section-3.pdf)
(see last page)

and answers:

[http://www-cs-faculty.stanford.edu/~eroberts/courses/cs106b/...](http://www-cs-faculty.stanford.edu/~eroberts/courses/cs106b/handouts/23a-section-3-solutions.pdf)

I also recommend the algorithms course from ArsDigita,

<http://aduni.org/courses/algorithms/>

------
summitpush
The link to the "Plain English Explanation of Big O" is the best part of this
article.

~~~
messel
I concur, and I'm the author. It's a wonderful introduction to Big O for folks
who struggle with it. My post is merely a tool to help refresh my knowledge
of fundamental concepts. William Shields does an excellent job of making Big O
approachable to non-CS majors.

The only passage I take issue with is his description of the O(n) arithmetic
examples. In my thinking addition/multiplication are O(1), but I blame that
view on years of thinking about FLOPs and letting counts of operations
conflate with big O notation. In practice, the example of adding two large
integers is a single operation, or at least a fixed-size handful of
operations, for arbitrary precision.

------
HenryR
Nowhere does this article provide a definition of big-O notation. Some
important details also seem to be missing - in particular the fact that
O(f(n)) describes an asymptotic _upper bound_. For example, searching an
unsorted array is O(n!); it's just not a very tight bound.

The linked-through article is much better.

edit: more precise wording
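For reference, the standard textbook definition of the upper bound being described:

```latex
% f is O(g) iff some constant multiple of g eventually dominates f
f(n) = O(g(n)) \iff \exists\, c > 0,\ \exists\, n_0 \in \mathbb{N} :
\quad 0 \le f(n) \le c \, g(n) \quad \text{for all } n \ge n_0
```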

------
bane
Next up? Big-Omega and little-o.

[http://en.wikipedia.org/wiki/Big_O_notation#Family_of_Bachma...](http://en.wikipedia.org/wiki/Big_O_notation#Family_of_Bachmann.E2.80.93Landau_notations)
[http://en.wikipedia.org/wiki/Big_O_notation#Little-o_notatio...](http://en.wikipedia.org/wiki/Big_O_notation#Little-o_notation)

The information on wikipedia is hopelessly inadequate for the layman to learn
from. I'd suggest somebody should edit it but it'd probably just get reverted.

Big-O is one of those concepts that is fantastically important for anyone
doing algorithm design. I wouldn't go so far as to say that one has to master
it, but thinking deep thoughts about the complexity of your algorithm can have
amazing results.

This is one of those topics that, in my experience, few self-taught developers
know about. A little story to illustrate. My wife was not really much into
typical computer developer stuff (nerd things, programming for fun, playing
with legos, etc.), but she was very strong in math and managed to work through
a B.S. in C.S. During college she interned for a fantastic self-taught
developer. I say fantastic because he managed to write a rock solid, very
scalable e-commerce back-end for a very large and very active site. It was
seriously impressive stuff.

Where he was struggling was with writing the reporting system. His system was
gathering tremendous amounts of data, but aggregating and pooling that data
into a meaningful and interactive reporting system had him completely lost.
His best shot at it took somewhere between 11 and 12 hours to compile the
results...or crashed sometime during the night.

Eventually management wisely put the project on the back burner in favor of
other front-end tasks, with a strong desire to revisit it soon. My wife was
given the task of looking at the code to see if she could do anything with
it. After going through the code and experimenting for a few days, she
discovered that the developer, in keeping things simple for himself, was
storing all of the data for aggregation in associative arrays (this was his
convention, and it really did make the code easy to maintain the way he used
them). She dug into the associative arrays a bit and started calculating on a
notepad the complexity of allocating millions of these things in memory. Very
quickly she realized that there were two issues.

1) The associative arrays took a long time to allocate in memory, lots of
bookkeeping variables and pointers were allocated with each one.

2) Each associative array ate up x bytes in memory. As they were allocated, in
about 11 or 12 hours they would consume all available system memory,
explaining the observed behavior.
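A hedged, hypothetical sketch of the kind of problem described (the actual system, language, and data are unknown): millions of small per-record associative arrays repeat per-object overhead that a single flat accumulator avoids.

```python
import sys

# Hypothetical reconstruction: one small associative array per record
# repeats dict bookkeeping overhead millions of times, while one flat
# accumulator keyed by sku does the same aggregation in a single dict.

records = [("sku1", 3), ("sku2", 5), ("sku1", 2)]

# Per-record dicts: every dict carries its own header and hash table.
per_record = [{"sku": sku, "qty": qty} for sku, qty in records]
bytes_of_overhead = sum(sys.getsizeof(d) for d in per_record)

# Flat aggregation: one dict total; each update is O(1) on average.
totals = {}
for sku, qty in records:
    totals[sku] = totals.get(sku, 0) + qty

print(totals)  # {'sku1': 5, 'sku2': 5}
```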

So she set out, with a notepad, to come up with a data structure and
algorithm to do the task as optimally as possible without exhausting system
memory.

Before ever writing any more code, she had come up with a solution. Then she
went about rewriting a bunch of the system and it worked! In fact it was so
fast that it could complete the task in about 30-45 seconds...meaning that it
was feasible to make the process interactive rather than a nightly batch
process.

When asked to explain how she arrived at the solution, my wife went into a
discussion of Big-O and how she realized the old system's growth was far too
fast to work, so she set out to come up with a solution on a slower growth
curve. Her developer boss had never heard of Big-O before, but the real-life
example he had just seen blew him away.

He ended up learning the topic and applying it to the rest of his work. A
couple of years later they spoke again, and he said it was probably the
single most important thing he had ever learned, as it had opened up entire
new projects he had thought undoable, as well as entirely new ways of
analyzing data structures and algorithms that he had never considered.

~~~
scott_s
I'm having difficulty reconciling these two statements,

 _His best shot at it took somewhere between 11 to 12 hours to compile the
results...or crash sometime during the night._

 _the developer, in keeping things simple for himself was storing all of the
data for aggregation in associative arrays_

With this one,

 _I say fantastic because he managed to write a rock solid, very scalable
e-commerce back-end for a very large and very active site. It was seriously
impressive stuff._

My reason is that I have difficulty calling someone a "fantastic developer" if
they don't have a basic understanding of algorithms, data structures and
systems. (If something takes a long time or crashes, run-away memory usage
should be high on the list of suspects if you have basic systems knowledge.)

~~~
bane
Yeah, I can understand why that might seem off. I'm using "fantastic" to mean
"built really rock-solid stuff, from scratch, on his own, without any formal
education". The e-commerce site, being a bunch of smallish transactions, was
much easier and more obvious to scale than a huge number-crunching job on a
single system (this all predates the common usage of map-reduce algorithms,
which probably would have been an even better fit in the end).

Plus there's tons of great tech out there that helps with scaling a site (load
balancers, sharding databases, etc.) that he didn't have to come up with, but
not a lot for building custom reporting tools to spec for business types. Most
of the requirements for the reporting tool weren't too bad -- "how many of
this sku did we sell in the last 24 hrs, 7 days, 30 days, year, forever?" and
could be solved pretty quickly and trivially. But scale something like that
up, across all skus, correlated to individual users, or user locations, or
supplier/warehouse availability, or whatever of another 100 variables and
things get complicated very very fast, then aggregate those results and
provide meaningful output to business types, and we're starting to talk
Computer Science vs. Development.

The story wasn't meant to pump my wife up over her boss; at the end of the
day, she solved one problem using her education, but she probably wouldn't be
able to sit down and build what he built elsewhere. There's a tremendous
amount of practical engineering knowledge that she (and I) just don't have
(and don't learn in school) that her boss did have -- mostly things you just
have to learn on the job.

I guess it really comes down to the semantics of developer. He's a great
software engineer type, he can plug components together to build bewilderingly
complex systems, but at the time he had a very naive view of algorithm
design...which is a very different discipline.

~~~
scott_s
It is a semantics issue, but just like I expect mechanical engineers to have a
strong grasp of Newtonian mechanics, I expect software engineers to have a
strong grasp of algorithms and data structures.

~~~
bane
I suspect at some point the development community is going to have to come up
with better names to call people.

A person who is fantastic with web technologies may not be great with real-
time audio processing tools or graphics processing tools or Natural Language
systems. But a person who is a genius with NLP may not be able to cobble
together a website or write an algorithm for fast approximations of an
inverse square root function. A developer who understands how to squeeze
every last cycle out of an embedded system will likely be lost in a massive
enterprise environment, and a person who specializes in machine learning may
stink at writing efficient database and storage systems. A developer great
at crypto systems might be laughed at if he wrote a video game. Someone who
specializes in low-level cache optimization may have absolutely no clue how
to write an e-commerce system. And a lifelong mainframe developer would be
lost writing device drivers.

I suspect that most people aren't _great_ at everything, even top-notch
developers. More likely people are really great at one thing, and pretty good
at a couple others and lousy at the rest.

There's a few exceptions I'm sure, but if your definition of "great developer"
only includes those people who are great at every bit of software development
trivia then we're probably talking about only a handful of people on the
planet.

Algorithm optimization is _hard_ and often not obvious. (How many people
really understood the algorithm for the fast inverse square root? I bet
thousands of developers used it, even plenty of great ones, but grokking it
is so rare that a series of articles was actually written to explain it.)
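The routine in question, the Quake III fast inverse square root, can be ported to Python as a sketch (the magic constant comes from the original code):

```python
import struct

def fast_inv_sqrt(x):
    """Approximate 1/sqrt(x) with the famous bit-level trick."""
    # Reinterpret the float32 bits of x as a signed 32-bit integer.
    i = struct.unpack('>i', struct.pack('>f', x))[0]
    # The magic constant roughly halves the exponent, guessing 1/sqrt(x).
    i = 0x5f3759df - (i >> 1)
    y = struct.unpack('>f', struct.pack('>i', i))[0]
    # One Newton-Raphson step refines the estimate.
    return y * (1.5 - 0.5 * x * y * y)

print(fast_inv_sqrt(4.0))  # ~0.499, close to the exact 0.5
```

The non-obvious part, which those articles explain, is why shifting the bit pattern and subtracting it from a constant approximates an inverse square root at all.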

There are plenty of perfectly competent and solid developers (even great) who
can hack together great stuff in their domain, but struggle with even the
basics just outside of that. An autodidact will probably be pragmatic with
what they learn, and most of that will probably be plumbing work since that's
what's strictly required to get something up and working.

I see from your page that you specialize in high-performance computing and
parallel systems. Are only developers who understand those topics great
developers? Because most developers don't do well with those topics. Even the
dev shop at Valve has struggled with parallelizing their software.

Saying "I expect mechanical engineers to have a strong grasp of Newtonian
mechanics", do you also expect them to have a strong grasp of non-Newtonian
mechanics? Since Newton's equations are simply a special case of relativity,
why shouldn't they know how to calculate the load stresses on a mechanism
using Einstein's much more precise work? Shouldn't they really have a
fantastic understanding of particle physics and quantum mechanics too? I mean,
the binding forces that hold together the materials they work with every day
are subject to that basic knowledge!

Alas! Now I'm concerned that the plucky civil engineer, when designing the
overpass I take to get home from work today, didn't quite get the section on
QCD Color Charge because he mistook the professor's sloppy handwriting of
lambda for an 'A'!

Perhaps my downtrodden electrical engineer is hiding a secret shame: he's
never understood how to accurately calculate ballistic trajectories over the
surface of the Earth! Oh sure, it looked so simple in class, but they never
went over how to actually do it, taking into account variable atmospheric
resistance, and a suitable model of the Earth's actual shape, and GRAVITY! Oh
gravity...why do you have to subtly change as the mistargeted missile drifts
lazily across your forlorn belly! It's just diff-EQ after all! Why then did
his water balloon slingshot fail to hit his neighbor's house precisely where
he wanted it?

I suddenly feel a great pity for these noble engineers, overcoming such
obstacles, such gaps in their basic knowledge! Yet still my overpass stands,
and still my car works, and still my lights turn on. Only because of this
spark of competence shall their secret, their hidden failings, remain safe
with me.

~~~
scott_s
While I agree that more accurate terms would be a good thing, I don't think my
expectations are all that high - nor does it only include "people who are
great at every bit of modern software development." And I _certainly_ did not
imply that only people who are good in what I'm good at are good developers.

What I do expect is for "good developers" to have, _at least_ , a firm grasp
of the material covered in most undergraduate Computer Science curriculums.
It's funny that you also use "real" engineers as an example, because we know
that any practicing engineer does have a firm grasp of the fundamentals of
their discipline: they had to take a licensing exam which ensured so.

~~~
bane
Maybe we should do a poll of HN? How many people know the details of algorithm
design w/r to the Family of Bachmann–Landau notations?

"It's funny that you also use "real" engineers as an example, because we know
that any practicing engineer does have a firm grasp of the fundamentals of
their discipline: they had to take a licensing exam which ensured so."

But do they understand the fundamentals in the examples I provided? My Civil
Engineering friend (with a PhD and all) doesn't know the first thing about
QCD, and he'll gladly tell you that. It's all just physics though, right?

 _edit_ poll is submitted.

~~~
scott_s
I never said details. I just expect a firm grasp of the fundamentals. If I say
"This algorithm is linear," or "This algorithm is quadratic," or even, "This
algorithm is linear in the worst case, but constant in the average case," I
expect them to know what I mean. Anyone who pulls a B average in an undergrad
CS program should be able to do that.
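Those statements can be made concrete with a sketch (Python, with list and set membership as assumed examples):

```python
# "Linear": list membership scans elements one by one, O(n).
items = list(range(100_000))
assert 99_999 in items

# "Quadratic": comparing every pair of n items is O(n^2).
pairs = [(a, b) for a in range(10) for b in range(10)]
assert len(pairs) == 10 * 10

# "Linear in the worst case, but constant in the average case":
# set membership hashes the key -- O(1) on average, degrading to
# O(n) only when many keys collide in the same hash bucket.
lookup = set(items)
assert 99_999 in lookup
assert -1 not in lookup
```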

I find your example with civil engineering and QCD disingenuous, because one
doesn't need quantum theory when building buildings, bridges and roads. My
point in mentioning Newtonian mechanics is that it is the _foundation_ of
mechanical engineering. I never said "it's all just physics," so I don't know
why you did.

~~~
bane
Considering that the developer in question was described as an autodidact and
lacked formal training, your rebuttals are just becoming pedantic.

 _I find your example with civil engineering and QCD disingenuous, because one
doesn't need quantum theory when building buildings, bridges and roads._

You conflate fundamental academics with fundamental knowledge.

People were building all kinds of great things before Newton. Lots of that
stuff is still standing, some of it is still functioning.

Here's some examples:

<http://en.wikipedia.org/wiki/Eupalinian_aqueduct>

<http://en.wikipedia.org/wiki/Pantheon,_Rome>

<http://en.wikipedia.org/wiki/Appian_way>

<http://en.wikipedia.org/wiki/Forbidden_city>

<http://en.wikipedia.org/wiki/Taj_mahal>

[http://en.wikipedia.org/wiki/List_of_aqueducts_in_the_Roman_...](http://en.wikipedia.org/wiki/List_of_aqueducts_in_the_Roman_Empire)

<http://en.wikipedia.org/wiki/Macchu_pichu>

Civil Engineering not your thing? Here's some mechanical engineering

<http://en.wikipedia.org/wiki/Antikythera_mechanism>

<http://en.wikipedia.org/wiki/Cannon>

<http://en.wikipedia.org/wiki/Aeolipile>

<http://en.wikipedia.org/wiki/Water_clock>

<http://en.wikipedia.org/wiki/Catapult> -- you couldn't _possibly_ make one
of these without knowing Newton's laws of motion...

<http://en.wikipedia.org/wiki/Bellows>

<http://en.wikipedia.org/wiki/Chain_drive>

<http://en.wikipedia.org/wiki/Crank_(mechanism)>

Here's some pre-Newton celestial mechanics

<http://en.wikipedia.org/wiki/Armillary_sphere> -- some even hydraulically
powered!

<http://en.wikipedia.org/wiki/Gnomon>

<http://en.wikipedia.org/wiki/Chinese_calendar>

<http://en.wikipedia.org/wiki/Antikythera_mechanism>

These guys got along more or less alright without fundamental Newtonian
mechanics

<http://en.wikipedia.org/wiki/Leonardo_da_Vinci>

<http://en.wikipedia.org/wiki/Hero_of_Alexandria>

<http://en.wikipedia.org/wiki/Archimedes>

I could go on, but you probably get the point.

To hear you tell it, humans were just scratching around in the dirt until
Newton lifted us into civilization on the wings of F=ma.

One doesn't _need_ big-O to write software either -- even great software. I
know that in academia these kinds of things are seen as absolutely
fundamental; every published algorithm paper should at _least_ state its
big-O complexity to be taken seriously. And that's great for your milieu, but
you need to come down from Mt. Olympus, mix with the rest of us mere mortals,
and just build stuff. I guarantee you can build all sorts of wondrous,
magical things without ever even _considering_ big-O.

~~~
scott_s
I'm sorry if you find me pedantic, but I'm honestly trying to get a point
across - and obviously failing. This discussion has been frustrating for me
because I thought my point was self-evident, but most of your responses seem
like non sequiturs to me. I also think you're being unnecessarily sarcastic
and antagonistic - Mt. Olympus and the like.

~~~
bane
ahh, I'm just ribbing ya.

You _do_ have a point, which I agree with, learning about complexity theory is
a tremendously useful skill to have as a developer.

Is it required to be a good one? You say yes (absolutely), I say no (but it
doesn't hurt).

Fair enough?

------
Montagist
Big-O...one of those things I never managed (or cared?) to pick up,
regardless of how many times it'd been drummed into me in school. Any article
that explains it in an approachable way is fine by me.

~~~
adobriyan
A more or less rigorous definition of O() requires limits first, which aren't
exactly an easy topic in high school.
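The limit-based definitions alluded to, in their standard textbook form:

```latex
% Little-o: f grows strictly slower than g
f(n) = o(g(n)) \iff \lim_{n \to \infty} \frac{f(n)}{g(n)} = 0

% Big-O via limits (for eventually positive g)
f(n) = O(g(n)) \iff \limsup_{n \to \infty} \frac{f(n)}{g(n)} < \infty
```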

