
Complexity theory isn’t required to be a programmer - nkurz
http://mortoray.com/2015/03/24/complexity-theory-isnt-required-to-be-a-programmer/
======
tptacek
This is to me a deeply weird article.

Thematically, it makes a case for complexity not being important to software
development. I disagree, but stipulating that the author is correct, where
does that slippery slope end? Virtually no formalized CS knowledge is required
to make expert use of Microsoft Excel, which is itself an extremely important
programming environment. What parts of CS does this analysis _include_ in the
requirements for development? Is it important to know how a hash function
works if your language gives you a dictionary ADT? Is it important to
understand the memory hierarchy if you're just shuttling things into and out
of an SQL database? There's a rather large population of "Rails programmers"
and "PHP programmers" that need know very little more than an "HTML
programmer" does to get their job done.

But it's the specifics of the argument that really bug me. The author seems to
be making a case that even if you understand complexity, it's hard to deploy
it in a real application. That idea seems totally alien to me. I don't
routinely analyze the complexity of a sort or a shortest-paths graph
reduction, but I certainly feel like I benefit from understanding how the
typical algorithms used to solve those problems scale with the size of their
inputs. Simpler: any time you select a binary tree instead of a hash table for
your lookup table because you realize you need range queries or sorted
traversal, aren't you leaning on complexity theory?
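
A toy sketch of that trade-off (nothing from the article; Python's built-in
dict stands in for the hash table, and the standard bisect module over a
sorted list stands in for the tree):

    # dict: O(1) expected point lookups, but no useful ordering.
    # sorted list + bisect: O(log n) lookups, and range queries come for free,
    # which is the same reason you'd reach for a balanced tree.
    import bisect

    prices = {"apple": 3, "fig": 11, "pear": 5, "plum": 7}
    print(prices["plum"])                  # point lookup: 7

    keys = sorted(prices)                  # ['apple', 'fig', 'pear', 'plum']
    lo = bisect.bisect_left(keys, "fig")
    hi = bisect.bisect_right(keys, "pear")
    print(keys[lo:hi])                     # range query: ['fig', 'pear']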

~~~
Alex3917
> Simpler: any time you select a binary tree instead of a hash table for your
> lookup table because you realize you need range queries or sorted traversal,
> aren't you leaning on complexity theory?

Only a nominal amount. You probably only need to spend a few hours watching
YouTube videos to make these sorts of decisions, or pass a phone screen.
Whereas there are entire classes and textbooks that cover only complexity
theory.

Software is a weird industry because the best and most famous developers tend
to be the ones who make software for other developers. And those folks tend to
know a ton about software engineering. But the actual folks using those tools
don't need to know all that much, and they're generally the ones making all
the money.

~~~
judk
That goes for _anything_. A carpenter needs to know wood theory, but doesn't
need to know how to genetically engineer new species of trees.

------
j2kun
This article betrays the author's misunderstanding of complexity theory.
Complexity theory is about finding algorithms which provably optimize some
notion of efficiency. Any notion of efficiency. Complexity theory is not
equivalent to using asymptotic notation to describe things. In particular,
when you find an algorithm that has, say, O(n) worst-case runtime, then you
know that the time-complexity of the problem you're trying to solve is O(n).
It could be better, like O(log n), and the goal of complexity theory is to do
better or prove you can't. You could also try to find the query complexity of
a learning problem in terms of the accuracy desired, or the round complexity
of a problem solved by MapReduce algorithms in terms of the number of
processors and the size of the input.

> How does the algorithm perform in concurrent computing? Clearly we’d prefer
> an algorithm that can scale over processors to one that can’t ... Certainly
> there is a way to put this all together in theory, but I don’t know those
> details.

There are no details to know. Any time you have any parameter of any system
which can grow (processors, input size, time, space, cache layers) you can
describe _anything depending on that parameter_ using asymptotic notation to
get a rough estimate of its efficiency. This is not a question of complexity,
just a question of how to express growth rates succinctly. Knowing how to
reason about rough estimates is essential for engineers because they have to
have some idea about why the thing they're building will conform to
specifications at least in principle before devoting the resources to
measuring it precisely.
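
To make "rough estimate" concrete, here is a back-of-envelope sketch (the
per-operation cost is a made-up constant, not a measurement):

    # Sanity-check a time budget for an O(n log n) sort before benchmarking.
    # The ~100 ns per comparison is an assumed, made-up constant.
    import math

    def rough_sort_seconds(n, ns_per_comparison=100):
        return n * math.log2(n) * ns_per_comparison * 1e-9

    for n in (10**6, 10**8):
        print(n, round(rough_sort_seconds(n), 1), "seconds, very roughly")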

And to reiterate: none of this has anything to do with complexity theory
except that in complexity theory we happen to use asymptotic notation very
heavily. The author's problem is with mathematical language being used in the
workplace.

~~~
tptacek
Yes, that's a compelling point, but to be clear: he's saying that the
asymptotic analysis is itself not useful to programmers. We don't even reach
the question of whether the _field_ of complexity theory is relevant to
programmers.

~~~
j2kun
Exactly. He's using the wrong words to describe his point, which is that he
doesn't think this particular mathematical language is useful. To someone who
is fluent in that language, he makes it clear that he doesn't know how to use
the language effectively (to stretch it, at times it doesn't seem like he even
understands the semantics). But indeed, how could any sort of language be
useful unless you know how to use it effectively?

------
aruss
I don't know why we keep passing around this kind of anti-intellectual, faux-
controversial drivel. Very little is _required_ to be a programmer (whatever
it means to be a programmer). No, you don't need to be an expert on algorithm
analysis or complexity theory or some other CS theoretical thing to be a
programmer, but it's quite obviously better to be able to reason in a certain
way about certain problems (and of course asymptotic analysis isn't everything
when dealing with practical applications).

~~~
Crito
> _" I don't know why we keep passing around this kind of anti-intellectual,
> faux-controversial drivel."_

There is a rather vocal minority in this community who think that attending
universities is a waste of time. These people are naturally interested in the
notion that things commonly taught in universities, but frequently glossed
over when self-teaching or at "hacker schools", are not necessary.

~~~
alanh
> not necessary

No education is necessary, but that doesn’t make it devoid of value. Speaking
as a US citizen, I truly believe what my country and countless others around
the globe desperately need is better and more accessible education, absolutely
including the Arts.

The opposing viewpoint can, in fairness, likely be called anti-intellectual.

~~~
Crito
Perhaps I should have mentioned that I am not a member of this minority.

If I am hiring for a developer position, I consider at least a basic knowledge
of complexity to be necessary. There are too many developers who are familiar
with it for me to waste my time on ones that are not.

I am not speaking of "necessary" in a universal sense, since that would be
nearly impossible to define and a useless concept even if we could. If we
really get down to it, _breathing_ isn't _necessary_...

------
steven2012
It's not required to be a programmer, much like learning how to drive manual
is not required to be a driver.

However, if you want someone to pay you the high salary to be a race car
driver, then you better know how to drive manual.

There are plenty of positions where people can program and do programming
tasks without a great deal of depth of knowledge in terms of programming. And
they can make a living doing this.

~~~
moron4hire
Bad analogy: Formula 1 drivers use selectable-gear, automatic transmissions.
It's more efficient to have a computer run the clutch.

~~~
gh02t
F1 cars use sequential gearboxes, which aren't the same as automatic
(actually, they're different from typical manual transmissions as well).

They _do_ have driver-operated clutches, though... actually there are TWO of
them, usually the bottom two paddles on the left and right of the wheel.
They're only necessary when starting the car from stationary; after that the
computer takes over. It's actually notoriously difficult to operate them: if
you take a look at this video of Richard Hammond driving an F1 car, he takes
several tries to get it moving.

[http://www.youtube.com/watch?v=9773pisjCSw](http://www.youtube.com/watch?v=9773pisjCSw)

Not to mention, there are other types of highly-paid race car drivers.

------
serve_yay
Indeed, you don't!

And we all look forward to your app being featured on
[http://accidentallyquadratic.tumblr.com](http://accidentallyquadratic.tumblr.com)
:)

------
Kenji
"Consider what O(n) means: nothing. That’s right, if somebody tells me
something is O(n) it truthfully tells me nothing on its own."

You, my good sir, are totally mistaken. O(n) is a set of functions that grow
at most linearly with n, ignoring constants. O(n) is a rich mathematical
notation that tells a _lot_ of things on its own. If someone tells you that
something 'is' O(n), what they actually mean is that they have a function that
is _an element_ of O(n). Again, a very concise and powerful statement on its
own. Of course you can't say "this tree is O(n)" because a tree is not a
function. That'd be like saying "this apple is dumb". Don't blame the
notation, blame the user.
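
For the record, the precise claim behind "f is O(n)" is just the standard
textbook definition, spelled out (not something the article states):

    f \in O(n) \iff \exists\, c > 0,\ \exists\, n_0
        \text{ such that } |f(n)| \le c \cdot n \text{ for all } n \ge n_0

Everything above (ignoring constants, "grows at most linearly") falls out of
that one line.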

------
kasabali
It is sad but true. Complexity theory isn't required to be a programmer; it
isn't required to release products, and that's one of the reasons why a lot of
the software products we directly/indirectly use every day suck so much.

~~~
vezzy-fnord
I think the main reason Wirth's law stings as much as it does isn't so much
ignorance of algorithmic complexity. Rather, it's that most programmers (and
all the languages, libraries, frameworks and various assorted tools and
materials that they use) still operate on a mental model of how computer
architecture works that is literally decades out of date.

Mainstream infrastructure hasn't done a stellar job of seamlessly abstracting
new hardware features (though compiler construction is still decent there in
many regards). The infrastructure that has isn't usually mainstream.

------
z5h
Complexity theory isn't required when prototyping an application. It is
required when engineering an application.

~~~
waprin
Not necessarily. In fact it can get in the way because complexity theory
ignores all those ugly little constant numbers. More important than complexity
theory is actually doing your own measuring and benchmarking, something I've
seen forgotten during engineering attempts far more often.
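
Something like this is usually enough to catch the surprises (a minimal sketch
using Python's standard timeit module; the sizes and probe are arbitrary):

    # Time membership tests on a list vs. a set as the input grows.
    import timeit

    for n in (1_000, 10_000, 100_000):
        setup = f"data = list(range({n})); s = set(data); probe = {n} - 1"
        t_list = timeit.timeit("probe in data", setup=setup, number=1000)
        t_set = timeit.timeit("probe in s", setup=setup, number=1000)
        print(f"n={n}: list {t_list:.4f}s, set {t_set:.4f}s")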

~~~
jerf
"In fact it can get in the way because complexity theory ignores all those
ugly little constant numbers."

No, it doesn't. A given person may ignore them, but the theory does not. It so
happens that complexity theory makes heavy use of a mathematical construct, the
"big-O notation", that makes it easy to ignore low-order terms' impact on a
process, but that is a tool it uses, not the totality of the theory.

I assure you that if you walk up to a complexity theorist and say "Your
ignoring of constant factors makes your entire discipline worthless!" you're
more likely to get laughed at than to blow their mind.

You want to analyze the real impact of some algorithm in the face of having
Big Ints and dealing with cache issues? Complexity theory is ready for you.
I've seen it done. The equations get nasty-big fast, but fortunately computer
algebra systems can chew right through them. Amusingly, in the end you end up doing
something very similar to O() anyhow and just start semi-arbitrarily chopping
off insignificant terms to figure out what's actually going on in the enormous
expression. But this can still be a useful exercise to get a sense of where
the low-order terms may be dominating, for how long, and whether there's
anything useful to do about it.

------
moron4hire
I think programmers _should_ need to know complexity theory, because if there
is a "programmer" type job that can be done without knowing it, then we should
be working hard to eliminate that job through automation. In other words,
dissolve the class of programmers that supposedly don't need to know
complexity theory.

Also, big-O is not the be-all, end-all of complexity theory. What a ludicrous
sentiment.

~~~
j2kun
Unfortunately, nobody has yet figured out how to automate the process of
encoding arbitrary sets of business rules written by non-programmers into
code.

~~~
moron4hire
The problem isn't that the tools don't exist. The problem is that the work is
organized to make someone not-the-manager think about the details. It's not
just "arbitrary" business rules that need to be encoded, it's "poorly defined,
poorly thought-out" business rules.

That's why SQL didn't take off as a lay-person interface to databases; even
though it was meant to be a system anyone could use, you can't get around the
fact that you still need to know what you want and what comprises what you
want. Your typical middle-manager type knows neither.

That's also why UML can't be converted to a program: if it could be, it would
call managers out on their inconsistent logic and the manager would just give
up and make someone else do it. The fact that it doesn't compile is _a feature_,
one that enables UML to make managers feel like they are part of the technical
process.

It's not a technical problem, it's a cultural one.

------
inverba
One thing the author doesn't mention is NP-completeness. You need complexity
theory to introduce/explain the concept of NP-complete problems, and I would
argue that knowledge of those is necessary for a good programmer. Otherwise
you might waste a lot of time trying to find a polytime algorithm for
something which (arguably) doesn't have one.

------
Veedrac
> If we’re dealing with collections people tend to mean time complexity,
> because most collections have a space complexity of O(n). I say most, since
> a common one, the radix tree, does not.

Does it not? It can't be less than O(n) simply by the pigeonhole principle and
it doesn't look like it's more than that.

~~~
LukeShu
I think your confusion comes from the fact that n does not refer to the number
of elements; it refers to the sum of the sizes of all of the elements. Let the
number of elements be m, for this discussion.

For elements that have a constant/uniform size, this distinction is pointless.
However, for things like strings, there can be a large variance in the size.

Applying the pigeonhole principle informs us that the radix tree has a lower
bound of O(m); but remember that O(n) could be quite a bit larger than that. A
radix tree works by only storing one copy of duplicated prefixes of the
elements; it still has an upper-bound space complexity of O(n), but the
expected complexity is lower than that.
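
A toy illustration of the prefix sharing (an uncompressed trie rather than a
proper radix tree, and entirely a sketch, but the space argument is the same):

    # Count characters actually stored vs. the raw input size n.
    def trie_chars(words):
        root, stored = {}, 0
        for word in words:
            node = root
            for ch in word:
                if ch not in node:
                    node[ch] = {}
                    stored += 1     # one character stored per new edge
                node = node[ch]
        return stored

    words = ["romane", "romanus", "romulus"]
    print(sum(len(w) for w in words))   # n = 20 input characters
    print(trie_chars(words))            # 12 stored: shared prefixes stored once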

~~~
SamReidHughes
It depends on how your computer works. If pointers are magic O(1) values, then
sure. But if you need as many bits in a pointer as necessary to distinguish
the total number of pointers, then you don't get an asymptotic size advantage.

Also, using n as the sum of the sizes of all the elements is not normal.

~~~
LukeShu
> If pointers are magic O(1) values

Um... pointers _are_ O(1) size. Usually either a fixed 32 or 64 bits.

> then you don't get an asymptotic size advantage.

I even said that! I specifically said that it's still O(n) upper bound, but
that the expected case is lower than that.

Though to be fair, I should have written Ω(m) instead of O(m) when discussing
the lower bound.

> using n as the sum of the sizes of all the elements is not normal.

1. I didn't say it was normal; just that in this case, that's what it was
referring to.

2. It is though! Worded another way, n is the size of the data being stored.

~~~
SamReidHughes
> Um... pointers are O(1) size. Usually either a fixed 32 or 64 bits.

If you want to talk about asymptotic growth of an algorithm, you need it
defined on a machine model that supports infinite memory. Saying that pointers
are special O(1)-size values is a reasonable way to do that. Another way is to
let them be variable size, and include that cost in your measurements. With
your 64-bit pointers, you're overpaying that cost. And the transition from
32-bit pointers to 64-bit pointers is an object lesson in the real existence
of this logarithmic factor that you'd like to pretend doesn't exist.

Edit: also, going with O(1) pointers, the expected asymptotic space usage
doesn't mean anything without specifying the probability distribution of
dictionaries involved (maybe parameterized on the size n). It's not hard to
construct one that changes the outcome.

~~~
LukeShu
Ok, fair.

It would have been correct for the original author to have written that most
collections have ϴ(n) space complexity, whereas radix trees only have O(n)
space complexity.

It's common for people to use O when they mean ϴ or Ω. All the original
statement was saying is that most collections use as much data as you put in,
but some (like radix tree) can use less.

(for fun, if we don't give O(1) pointers: If I'm not mistaken, radix tree has
O(m*log(m)+n), Ω(m) space-complexity, which is O(n) iff the average size of an
element is ≥ log(m); which it almost certainly would be)

~~~
Veedrac
> which is O(n) iff the average size of an element is ≥ log(m); which it
> almost certainly would be

That's stronger than a mere "almost certainly". At least some constant
fraction (dependent on the alphabet) would be of size Ω(log m) for some base
also dependent on the alphabet, so the average size is also Ω(log m).

I'm not sure how you get your best case, though. It seems to me that the best
case has at least Θ(m) nodes since there are m contained strings. The best
case for this is each being of minimum size - which is constant size. Each
node but the first has a pointer to it of size Ω(log m), so the overall
minimum cost is Ω(m log m).

------
chucksmart
If you distinguish a 'software engineer' from a 'programmer', this article
makes sense to me. But the article does not even use the word 'reduction',
which seemed to be 90% of what we did for homework in my computational
complexity class.

------
iphone7166
Yes, you could get things done most of the time without knowing it, but why
bother? It is not hard to learn, and you'll need it anyway someday if you code
enough.

------
j2kun
Considering that people were having typical days of programming before
complexity theory existed, of course it's not required. I don't know of
anybody who seriously claims that it is required.

~~~
kristjan
Complexity theory had been explored, if not deeply, somewhat before what we'd
consider "typical days of programming":
[https://en.wikipedia.org/wiki/Computational_complexity_theor...](https://en.wikipedia.org/wiki/Computational_complexity_theory#History)

~~~
j2kun
I suggest you read that more closely. There was no theory laid out until the
late 60's, and "complexity theory" was not a field until the 70's. On the
other hand, there is a long list of programming languages invented in the
50's.

------
crimsonalucard
In English we tend to use complexity to refer more to how intricate a system
is rather than to algorithmic run time. It's confusing because programming is
involved with the intricacy of systems in a big way. That's why, imho, the word
'complexity' is actually a really bad naming choice. Typically when I refer to
complexity, I mean complexity in design and structure.

Programming deals more with this type of design complexity, for which currently
no theory exists. How do we quantify whether one program is more intricate than
another? How do we quantify whether the functional style is less complex than
an object-oriented style? There's no way. Since no theory exists, it largely
comes down to using intuition and design to deal with complexity. In short, no
theory required, only because no theory exists (yet).

~~~
tptacek
Isn't that just another facet of the same concept? The property you're
referring to as "intricacy" is quantifiable and scales with the kinds of
problems a program solves.

~~~
crimsonalucard
How is it quantifiable? Which is more intricate, Linux or Windows? I don't
think that question is answerable in any way that's meaningful. If you ask
anyone that question you get a lot of qualitative descriptions and opinions,
nothing definitive.

~~~
tptacek
IPC boundaries, interface size, basic block count and CFG arity, number of
different storage classes, number of different objects, number of different
object lifecycles, &c.

I'm not saying there's one best known way to measure the complexity of a
system, just that it is measurable.
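
For instance, here's one crude, fully automatic measurement in that spirit (a
sketch roughly along the lines of cyclomatic complexity, using Python's
standard ast module; the list of node types counted is an arbitrary choice):

    # Count branch points in a piece of Python source as a rough complexity proxy.
    import ast

    def branch_points(source):
        branches = (ast.If, ast.For, ast.While, ast.Try, ast.BoolOp)
        return sum(isinstance(node, branches) for node in ast.walk(ast.parse(source)))

    src = "def f(x):\n    if x > 0:\n        return x\n    return -x"
    print(branch_points(src))   # 1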

~~~
crimsonalucard
These are black-box metrics, better than counting lines of code, but
essentially never truly able to describe the complexity of a system. To my
knowledge, there's not even a strict theoretical definition describing what
complexity is.

~~~
tptacek
This seems like a discussion that ends in us arguing about the validity of the
concept of a human soul.

