
Big-O notation explained by a self-taught programmer - maxt
https://justin.abrah.ms/computer-science/big-o-notation-explained.html
======
dizzystar
As a self-taught programmer, I always get nervous when I see articles
explaining Big O, because they are almost always wrong. While this one is
mostly correct, it is a bit simplistic and misleading, and it could use
considerable annotation. The "scary" Wikipedia article has examples that are
more nuanced than what is written here; the first example shows a
constant-time algorithm with nested for loops.
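
For instance (a minimal sketch of my own, not the article's code), nested for
loops are still constant time when the loop bounds are fixed constants:

    # O(1): both bounds are constants, so the work never grows with the input
    def constant_time(data):
        total = 0
        for i in range(100):        # fixed bound
            for j in range(100):    # fixed bound
                total += i * j      # ignores len(data) entirely
        return total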

I take issue with the math-fearing tone throughout the article. Time
complexity is math, period. There is no getting around this fact, and the
sooner you accept it, the sooner you learn that the math isn't that difficult,
and the sooner you realize that, without some intuition for what the math is
saying, you'll never really have a firm grasp of Big O or any of its
variations.

Also, why no mention of log time?
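
Binary search is the canonical O(log n) example; a minimal sketch (mine, not
from the article):

    # O(log n): each iteration halves the remaining search range
    def binary_search(sorted_list, target):
        lo, hi = 0, len(sorted_list) - 1
        while lo <= hi:
            mid = (lo + hi) // 2
            if sorted_list[mid] == target:
                return mid
            elif sorted_list[mid] < target:
                lo = mid + 1
            else:
                hi = mid - 1
        return -1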

~~~
Waterluvian
As a self-taught programmer who did no math in university, the most valuable
thing for me was the discovery that the math really isn't scary. Once you
understand the symbols and notation, most concepts are rather easy, provided
you commit time to learning them.

I strayed away for so long, until I watched the MIT lecture series on
algorithms and it all just clicked. The best feeling was seeing the math on
the chalkboard for how you determine the complexity of operations on various
graph layouts and just getting it. I struggled in high school math and rarely
got that feeling in class.

It was just one of those "holy crap... it's all just an array, and how we
organize the data in that array gives various benefits and drawbacks!"
moments. Incredibly empowering.

Edit: This is the series I watched:
[https://www.youtube.com/watch?v=HtSuA80QTyo&index=1&list=PLU...](https://www.youtube.com/watch?v=HtSuA80QTyo&index=1&list=PLUl4u3cNGP61Oq3tWYp6V_F-5jb5L2iHb)

~~~
code777777
Would you please share the link to the MIT lecture series? It may help the
rest of us too. Thanks!

~~~
Waterluvian
[https://www.youtube.com/watch?v=HtSuA80QTyo&list=PLUl4u3cNGP...](https://www.youtube.com/watch?v=HtSuA80QTyo&list=PLUl4u3cNGP61Oq3tWYp6V_F-5jb5L2iHb)

I used YouTube to slow down and speed up the videos as necessary. Watched some
parts a few times. Had a notepad out.

------
c0achmcguirk
My company always works a "Big-O" question into its interviews. It's funny:
we ask about the complexity of an algorithm, and maybe 50% of applicants even
know what Big-O notation is.

It doesn't seem to impact hiring decisions, though. We haven't turned anyone
down because they missed that question.

I think that's because anyone who has programmed professionally for a year or
so already understands the concept of inefficient algorithms. They don't need
to measure it mathematically; they just learn how to optimize.

~~~
user5994461
That makes sense. Big-O notation is so overrated.

A little bit of history: C++ programmers were always able to choose the right
containers simply by using this image, since long before the invention of
Big-O.

[http://homepages.e3.net.nz/~djm/containerchoice.png](http://homepages.e3.net.nz/~djm/containerchoice.png)

~~~
bogomipz
> "That makes sense. Big-O notation is so overrated."

Can you explain how having a classification that allows one to determine
whether something is logarithmic vs. quadratic is so overrated?

Isn't that a bit like saying "Algebra is so overrated"?

Also Donald Knuth introduced Big O somewhere around the mid 1970s and C++
didn't come along until 1979.

[https://en.wikipedia.org/wiki/Big_O_notation#cite_note-knuth...](https://en.wikipedia.org/wiki/Big_O_notation#cite_note-knuth-12)

Also, your link is a visualization of ADTs, not run times. And while it is
true that ADTs are chosen for certain guarantees, it still depends on how they
are used in an algorithm.

~~~
mikebenfield
I think (hope?) that user5994461's post was tongue in cheek.

BTW, you are way off saying Don Knuth introduced Big O. It was invented by
mathematicians like Paul Bachmann and Edmund Landau, many decades before the
1970s.

~~~
bogomipz
I didn't know the OP's post was tongue in cheek. Sometimes it's hard to
tell : )

Sure, "O" goes back to mathematicians in the late 19th century. What I meant
was that Knuth introduced Big O(micron) in the context of "Computer Science"
literature.

This is the source I am referring to from 1976 SIGACT News:

[http://www.phil.uu.nl/datastructuren/09-10/knuth_big_omicron...](http://www.phil.uu.nl/datastructuren/09-10/knuth_big_omicron.pdf)

On page 21 or page 4 of the PDF:

"I would like to close this letter by discussing a competing way to denote the
order of function growth. My library research turned up the surprising fact
that this alternative approach actually antedates the O-notation itself. "

On page 22 or page 5 of the PDF:

"The main reason why 0 is so handy is that we can use it right in the middle
of formulas (and in the middle of English sentences and in tables which show
the running times for a family of related algorithms etc.)."

------
achr2
The thing I always run into when discussing big-O is people (good programmers
even) who think all O(x) algorithms have the same efficiency. I find it very
frustrating when someone says my streamlined O(N) algo with 5 operations has
the same efficiency as their O(N) algo with 20 extra function calls and
operations. Big-O is not the only determining factor...

~~~
BjoernKW
In a way, Big-O notation intentionally glosses over these differences. At a
large scale, 5 vs. 20 per instance of n doesn't matter. It might matter for
practical purposes, but Big-O really is about making broad distinctions.

~~~
_greim_
Yup. Achieving a better big-O could, for example, be the difference between
something being possible or not, whereas fine-tuning an algo without altering
its big-O might be the difference between needing five servers or ten. Still
relevant, but a different class of relevance.

~~~
FreeFull
It can also be the case that an algorithm with a worse big-O complexity is
actually better, because all your inputs are small and its constant multiplier
is lower than that of the more complicated algorithm with the better big-O.
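
A concrete everyday case (my own sketch, not a benchmark claim): on a tiny
sorted list, an O(n) linear scan often beats an O(log n) binary search simply
because its per-step cost is lower:

    import bisect
    import timeit

    small = list(range(8))   # a tiny sorted list

    # O(n) membership test, but with a very small constant factor
    linear = timeit.timeit(lambda: 7 in small, number=1_000_000)

    # O(log n) binary search, but with more per-call overhead
    binary = timeit.timeit(lambda: bisect.bisect_left(small, 7), number=1_000_000)

    print(linear, binary)    # on inputs this small, the scan often wins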

~~~
stouset
This suggestion gets trotted out in every discussion of Big-O. In 16 years of
doing this professionally, I have not once encountered a situation where that
would have been relevant. In any case where it even remotely might have been,
the code wasn't hot enough to warrant considering it at all.

~~~
FreeFull
One example would be multiplying large matrices together. A commonly used
algorithm is
[https://en.wikipedia.org/wiki/Strassen_algorithm](https://en.wikipedia.org/wiki/Strassen_algorithm)
with a big-O complexity of around O(n^2.8074).
[https://en.wikipedia.org/wiki/Coppersmith%E2%80%93Winograd_a...](https://en.wikipedia.org/wiki/Coppersmith%E2%80%93Winograd_algorithm)
has a complexity of around O(n^2.375477), but its constant factors are so
large that it is currently never better to use it over the Strassen algorithm.

------
mrcactu5
Don't you want to know if your recipe takes 10 minutes or 10 hours to cook?

~~~
R_haterade
Sometimes. Sometimes you just need to throw things in the oven until the
first one browns up.

I recently had a problem where I needed an adjacency matrix of shortest paths.
It was a choice between Dijkstra and Floyd-Warshall. My Dijkstra
implementation beat the pants off Floyd-Warshall for this application by an
order of magnitude, which you wouldn't really expect. The big-O complexity was
the same for both algorithms; it's just that for my graph structure, the
operation count for Dijkstra was much lower. It was the first one to brown
up.

~~~
gk101
A few comments, just in case this is a critical part of your program and
running it faster would help:

1. Using a heap with Dijkstra's algorithm would speed it up on sparse graphs
(see the sketch after this list).

From your comment I am guessing you are running Dijkstra's algorithm without a
heap, which costs O(n^2) for one source, and O(n^3) for all sources (all-pairs
shortest paths), where n is the number of nodes/vertices and m is the number
of edges.

With a binary heap, one source costs O(m * lg n), so all sources cost
O(n * m * lg n), which improves the running time by quite a lot in practice on
sparse graphs.

2. The time it takes to run the program also depends on the density of your
graph:

In the case where the graph is sparse (m ~ n), all pairs cost O(n^2 * lg n)
with a heap (or O(n^3) without one).

In the case where the graph is dense (m ~ n^2), the heap version costs
O(n^3 * lg n), which is actually worse than the O(n^3) of the plain array
version.

Compare the numbers above to O(n^3), which is the time complexity of
Floyd-Warshall. So just in terms of time complexity, Floyd-Warshall would be
at least as fast on a dense graph.
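
A minimal sketch of the heap version in Python, assuming the graph is an
adjacency list mapping each node to (neighbor, weight) pairs (that
representation is my assumption, not something from the thread):

    import heapq

    def dijkstra(adj, src):
        # adj: dict mapping node -> list of (neighbor, weight) pairs
        dist = {src: 0}
        heap = [(0, src)]                      # (distance so far, node)
        while heap:
            d, u = heapq.heappop(heap)
            if d > dist.get(u, float('inf')):  # stale heap entry, skip it
                continue
            for v, w in adj[u]:
                nd = d + w
                if nd < dist.get(v, float('inf')):
                    dist[v] = nd
                    heapq.heappush(heap, (nd, v))
        return dist

    def all_pairs(adj):
        # one O(m lg n) Dijkstra per source: O(n * m * lg n) overall
        return {src: dijkstra(adj, src) for src in adj}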

~~~
R_haterade
Agreed, but this was for a fast script that will probably run 100 times for
analysis purposes--the code itself wasn't the end goal. I just needed
something that could crunch my numbers in 10 minutes instead of 2 hours. :-)

------
castratikron
Didn't know that the "O" in Big-O actually means "order". I guess it doesn't
really matter too much.

If you really wanted to get theoretical, you could also talk about the whole
family, including little-o, big-Omega, little-omega, etc. :)

~~~
bogomipz
I don't believe this is correct. I believe the O is for the Greek letter
Omicron, just as the other Greek letters Big Theta and Big Omega are used in
discussing bounds in time complexity.

I think O meaning "order" would be a retronym.

------
allan_s
After seeing a lot of peers misled by big O, I think most big O articles on
the web, for the sake of clarity or simplicity, omit one thing:

if we have two functions, square_big_o in O(n²) and linear_big_o in O(n),

then the statement

time(square_big_o(x)) > time(linear_big_o(x))

is not necessarily true for every value of x,

because the actual costs could be

    1000 + 1000*N
    2 + N²

in which case, for small N, you would choose the quadratic one, because in
practice it will be faster.

Without this in mind, you get peers who tell you that something is faster
because the algorithm is O(n) versus O(n²), even though we're talking about a
function that will always have n < 10. They read an article about big O last
night, and now they have to throw a "let's think about the big O" remark at
every single problem we meet.
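
To make that concrete, a quick check (my own sketch, using the hypothetical
cost formulas above) of where the two cost curves cross:

    # hypothetical cost models from the comment above
    linear = lambda n: 1000 + 1000 * n     # O(n), big constants
    quadratic = lambda n: 2 + n * n        # O(n^2), tiny constants

    # first n where the O(n^2) cost finally overtakes the O(n) cost
    crossover = next(n for n in range(1, 10**6) if quadratic(n) > linear(n))
    print(crossover)                       # 1001: below that, quadratic wins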

~~~
Bahamut
Except that if one knows the definition of big O, there is nothing to be
misled by. What big O tells you is which function bounds yours, up to a
constant factor, beyond some lower bound on the input.

One always wants to do performance testing to see whether it comes into play,
even in practical applications such as real code.
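
For reference, the usual formal definition (the standard textbook statement,
not something from the article):

    f(n) = O(g(n))  iff  there exist constants c > 0 and n0 >= 0
    such that |f(n)| <= c * g(n) for all n >= n0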

------
adsofhoads
Extensively incorrect.

