What does O(log n) mean, exactly? (stackoverflow.com)
145 points by martindale 1289 days ago | 92 comments

The example in the question seems to demonstrate a common misunderstanding with regard to Big O notation. It measures the time complexity of an algorithm in terms of its input size. But the n in the OP's example is not the input size, it is the input value.

In the actual implementation the input size is of course constant (being an int), but if you look at the intended algorithm (I have renamed the variable to b on purpose, to avoid confusion with the n usually used in Big O notation), we have something like:

* Take a number b.

* Loop from 0 to b-1 and print the number of the current iteration.

As far as this algorithm goes, the input size here is not b, but of order log(b) (number of digits to represent b), while the algorithm's runtime is proportional to b. That means this algorithm has exponential runtime.
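To make the value-vs-size distinction concrete, here is a minimal Python sketch (the function name is mine, purely for illustration):

```python
def steps_to_print_up_to(b):
    # Stand-in for the algorithm above: loop from 0 to b-1.
    # Count iterations instead of printing, so the runtime is easy to inspect.
    steps = 0
    for _ in range(b):
        steps += 1
    return steps

b = 1024
print(steps_to_print_up_to(b))  # 1024 steps: linear in the *value* of b
print(b.bit_length())           # 11 bits: the *size* of the input in binary
```

1024 iterations for an 11-bit input: the runtime is roughly 2^(size-1), i.e. exponential in the input size, exactly as described above.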

Your first paragraph is also a slight misunderstanding of big O notation. In fact, the notation allows to make general comparisons between the growth rates of arbitrary functions, not necessarily referring to time complexities of algorithms (although that is a frequent use of big-O notation). For example, sometimes you use big O to express the amount of space an algorithm uses, sometimes you just count some objects with a certain property, etc.

When it comes to time complexity, one always has to clarify what is the computational model assumed. What are the allowed operations? What are we counting? Are we counting the number of comparisons between arbitrary (comparable) elements? Are we counting operations on bits? Are we counting arithmetic operations on numbers that fit in a register of some idealized computer? Of course if the model is not agreed upon, the answers can differ, as in your example.

Once the model is clear, and you want to express the running time of an algorithm, the result can in fact depend on both input size and input value (although this distinction makes more sense in certain models than in others). An example is the Ford-Fulkerson algorithm for max flow, whose running time (in the RAM model) depends both on the size of the network and the value of the max flow.

> In fact, the notation allows to make general comparisons between the growth rates of arbitrary functions, not necessarily referring to time complexities of algorithms (although that is a frequent use of big-O notation).

True, I was simplifying by restricting to the context the OP was asking about (time complexity of an algorithm).

Yes, I didn't mean to sound pedantic - just meant that there is no single way to measure time complexity of algorithms - when we talk about sorting, often we assume the comparison model: only comparisons between elements count, and we can't do any arithmetic on the elements - then we have O(n log n). In another model we assume the elements are numbers of reasonable size and we can operate on them in constant time (a somewhat realistic model of a CPU) - then we can sort in O(n).

That's why I think it is good practice to separate the understanding of the mathematical concept from the study of algorithms and just be clear about what we are counting.

I get your point now, and it is indeed an excellent one to make.

In my opinion the misunderstanding arises due to the lack of understanding of what log n means, and not so much O notation.

Intuitively, logarithms are much easier to understand in the context of exponentials. For example, if there is "something" (be it a function, space, money, etc.) which starts at value "c" and doubles with each iteration: c, 2c, 4c, 8c,..., that is what we call exponential growth and iteration n would be represented by the following notation: c * 2^n.

Where does the logarithm come in? It is defined in the following way:

   if a^b = c, then log_a c = b   (note: read as log of c in base a equals to b)
so an exponential relationship can always be written in terms of logarithms.

Now, if the answer that we are looking for is not the actual value, but how many iterations did it take to get there ("b" above), that is the direct result of applying the logarithm to the function.

How does it all apply to functions execution time and Big O notation? Well, if we have a space N that we are searching over via a function F, and this function does so iteratively by dividing search space N into 2 at each iteration and discarding the wrong half, we will have the following sequence representing the space to search over at each iteration: N, N/2, N/4, ..., N * (1/2)^n.

If the question is, how long will it take this function to run? The answer is that starting at space N, the function will run until it has narrowed down the space to 1, so the search is over and it found (or not) what it was looking for. So the question is really, how many iterations will the function need to do? And as explained before this is exactly the definition of logarithm.

Therefore: ('~' defined as 'proportional to')

   time to run ~ # of iterations n, where N * (1/2)^n = 1, i.e. n = log_2 (N)
In general, intuitively (not always true though) if there is a function that at each iteration is dividing the size of the work to be done by a constant factor (2,3,4,...) the run time, that is the number of iterations, will be O(log n).
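A minimal Python sketch of the halving process above (names are mine):

```python
def halving_iterations(n):
    # How many times can a search space of size n be halved before only
    # one candidate remains? This is the iteration count described above.
    iterations = 0
    while n > 1:
        n //= 2  # discard the wrong half of the search space
        iterations += 1
    return iterations

print(halving_iterations(8))     # 3  (= log_2 8)
print(halving_iterations(1024))  # 10 (= log_2 1024)
```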

OP's example accepts a parameter, and that parameter is the input size, not any of the input values. In fact, in the example, there are no input values, the function is just doing busy work -- although, I suppose it's also possible to say that i (0,1,2,3,...,n) are the values, since that's what he's printing. However, as you said, the individual input values are irrelevant to the complexity.

Consider this slightly modified function:

    f(string[] phonebook, int phonebookSize) {
      int i;
      for (i = 0; i < phonebookSize; ++i)
        print(phonebook[i]);
    }
Here it's clear that phonebookSize (n) is the input size, not an input value -- however the function is essentially the same otherwise.

EDIT: Sorry, I understand now. You are saying that the way he's defined his function is what determines the input size. Since it accepts one integer, the complexity could be described as O(2^n) (where n is the size of the integer in bits, or digits, or whatever). I had made the assumption that my updated function is what OP's intention was.

I'm confused, wouldn't this be more of an O(n) instead of an O(log n) issue? If not, then what is the difference? No pointers being used, not even a data structure, just an array of strings. I expected at least a linked list data structure for a phone book. Like sorted in alphabetic order or something.

Correct, OP's example contained an O(n) function:

> For example, the following function is O(n) because the algorithm grows in proportion to its input n: [...]

OP's question was whether there exists a similarly simple example of O(log n), which is not related to what anonymouz was commenting on. (or I'm misunderstanding?)

The value n often stands for the input size measured in bits, but not always. As long as you make it clear what n means, there is nothing wrong with using O-notation for that n. O-notation is just a way to denote a set of functions. This notation is also commonly used in mathematics and physics, where it doesn't even have anything to do with algorithms.

Of course, but the OP is not asking about big-O-notation in mathematics in general. He is specifically asking about what it means when applied to time complexity of algorithms.

I've always read O(n) as "The worst-case (time|space) complexity in respect to n" where n is typically the input size, but could be any defined value (in this case, the value of b).

I'm confused by this.

His function f(x) is of linear time-complexity (with respect to the value of x). Assume it takes time c1 to print a number; thus I will create a function g(x) = c1*x + k, where k is a constant representing the overhead. Then f(x) < g(x), and g(x) is an element of the set O(x).

I am not sure how that makes his function run in exponential time. Am I misunderstanding something?

The function runs in linear time relative to the value of x. However, time complexity refers to the size of the input, not the value. For example, binary search's O(log n) time complexity refers to the number of elements n in a list (n varies with the list's size). x + 1 will always increase the value of x, but not necessarily the size. The size of x is the number of digits in x. f(x) runs once if x=1, ten times if x=10, one hundred times if x=100, etc. So each time x's size increases by 1, f's runtime increases by a factor of 10.

It's a very literal definition of time complexity and it's counterintuitive, but ultimately it does make sense.


It's the difference between a function taking a list and a function taking a number.

A list's size depends on the number of elements in the list. Adding an element to a list increases the size of the list.

A number's size depends on the number of digits required to express the number. Number++ will always increase the number's value, but not necessarily the number's size. anonymouz's point is that we should describe f(x)'s runtime in terms of the size of x, not the value of x. While I agree that the natural interpretation is to use x's value when x is a number, and x's size is going to be fixed anyway, f(x) is exponential in relation to x's size and linear in relation to x's value.

Put another way: 4 and 5 have different values, but the same size: we're expressing both with 1 digit. 4 and 10 have different values and different sizes, as we require 2 digits to express 10 and only 1 to express 4. Of course we'd probably be dealing with binary, but it's the same idea.
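A quick Python illustration of this size-vs-value idea (decimal, as in the comment, though binary works the same way):

```python
def decimal_size(x):
    # "Size" here means the number of decimal digits needed to write x down.
    return len(str(x))

print(decimal_size(4), decimal_size(5))   # 1 1: different values, same size
print(decimal_size(4), decimal_size(10))  # 1 2: different values and sizes
print(decimal_size(10 ** 6))              # 7: size grows by 1 per factor of 10
```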

It's really a question of semantics/how you interpret OP's example and the natural interpretation leads to the function being O(n). However anonymouz makes an interesting point that I hadn't considered before.

But the size of the number is such a programming-centered view of things. In my opinion it doesn't at all reflect the mathematics behind it. It's completely dependent on the representation of the number, which could really result in anything.

The only thing that really matters is the size of the set of numbers that you do your computation on.

What's really going on here (not in code but conceptually) is that you have two steps. First you take a number `n` and basically convert it into a generator for the set of numbers `{0, ..., n-1}`. We can assume that to be a O(1) step. Then in the second step you apply a O(1) function for every number in the set which is clearly O(n). So you end up with O(1) + O(n) which boils down to O(n).

So we have functions

`g(n) -> generator for {0, ..., n-1}`

`f(x, funct) -> funct(i) for i in x`

and we combine them into

`p(n) -> f(g(n), print)`

Clearly, the input size of the arguments to `p` is no longer meaningful for the total runtime. You have to assume it is a set of numbers to make any sense.
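The decomposition above could be sketched in Python roughly like this (my sketch, not anything from the thread):

```python
def g(n):
    # O(1) step: build a lazy generator for {0, ..., n-1}; nothing is
    # actually produced until it is iterated over.
    return iter(range(n))

def f(xs, funct):
    # O(n) step: apply an O(1) function to every element of the set.
    for x in xs:
        funct(x)

def p(n):
    # p(n) -> f(g(n), print): O(1) + O(n), which boils down to O(n)
    # in the size of the generated set.
    f(g(n), print)

p(3)  # prints 0, 1, 2
```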

So in my opinion while anonymouz is technically correct, it does not make any sense to see it that way. Big-O notation/analysis is there to gain some understanding of your algorithm's complexity, not to link the number of digits in a number to the complexity of your algorithm.

>So in my opinion while anonymouz is technically correct, it does not make any sense to see it that way.

I absolutely agree, and anonymouz's approach is not how I would approach time complexity. I thought it was an interesting point and so I decided to explain where he was coming from, but practically I don't think it makes any sense.

Consider a function that prints a list of the numbers 1 through N. This function is linear in the size of its input, because we're talking about the size of a list.

But now, make a function that prints the first N numbers. This function is exponential in the size of its input, because we're talking about the size of a number.

We can optimize this second function by replacing our binary representation of a number with a unary ("tallying") representation. Now the size of a number grows linearly with its value, and so our function is linear instead of exponential. Nice win!

anonymouz's interpretation is bizarre and definitely not how I'd think about the function, I'm just trying to explain where he's coming from.

I don't plan on using the interpretation in the future but it was confusing to me too and I thought it was an interesting thought so I figured I'd attempt to explain.

> anonymouz's interpretation is bizarre and definitely not how I'd think about the function.

It is usually understood that one measures the time complexity of an algorithm in the size of the input (only if the input is an integer, would it even make sense to measure it in the value of the input). So you may of course use whatever definition you like, but be careful that statements about the time complexity of certain things are often made without referring to the exact definitions used, and you may then misunderstand them.

For example, when people say "We don't know a polynomial-time algorithm to factor integers" they mean polynomial time in the size of the input (i.e., number of digits/bits of the integer). The problem would trivially be polynomial time if you allowed it to be polynomial in the integer itself.

Sorry, I deleted my post before I saw your response.

Surely the n in the OP's example is the size of the input. The input is the phone book (or a given page in one case, etc) and n is the number of entries.

I sometimes wonder, what sort of people actually need to know about Big-O notation? In about 15 years of working in IT I don't recall anybody ever using it in a work context. Generally something like "this strategy will become very slow when we have a lot of users" seems adequate. Admittedly I'm only a humble web developer. Do you regularly talk in Big-O at work? What is your job?

Edit: Despite being only a humble web developer I understand that fast algorithms are better than slow ones. I'm wondering how Big O notation is used in the real world.

Yes, regularly. I'm a backend web developer who works on a fairly large-scale website, and program mostly in Python. It's pretty important to know that code like this (which we have in our photo search code):

   for image in images:
       if image.hotel_id in hotel_ids:
will be O(N) if hotel_ids is a set or dict (with O(1) lookup), but O(N²) if hotel_ids is a list (with O(N) lookup). We're not talking about micro-optimizations here -- when N is of moderate size, it can mean the difference in render time between 30 milliseconds and 3 seconds.
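The effect can be sketched with made-up data (the `images`/`hotel_ids` names follow the comment, but none of this is the actual production code):

```python
# Hypothetical stand-ins: 5000 images, each tagged with one of 1000 hotel ids.
images = [{"hotel_id": i % 1000} for i in range(5000)]

hotel_ids_list = list(range(0, 1000, 2))  # membership test is O(M) per image
hotel_ids_set = set(hotel_ids_list)       # membership test is O(1) on average

# Same loop shape, very different total cost: O(N*M) vs O(N).
matches_list = [img for img in images if img["hotel_id"] in hotel_ids_list]
matches_set = [img for img in images if img["hotel_id"] in hotel_ids_set]
print(len(matches_set))  # 2500 matches either way; only the speed differs
```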

When developers don't have a feel for this, it really slows them down -- both figuratively and literally.

FYI, I wrote a blog entry about how our photo search engine works here: http://tech.oyster.com/how-our-photo-search-engine-really-wo...

> When developers don't have a feel for this, it really slows them down -- both figuratively and literally.

Of course, I'm just not clear how Big-O helps with that. To me, the required insights seem to be:

* Things are slower when the computer has to do more work

* A feel for what strategies result in the computer doing more work

I can't think of many cases when saying "This algorithm is O(N^2)" is significantly more useful than saying "This algorithm is going to be really slow".

It's like grammar: You don't sit around articulating "this is the subject, this is the object, and I need the third person singular over there" when you write - but knowing the mechanics of grammar allows you to write texts with correct grammar more or less intuitively, and it's easy to reach back and pull out a rule when you think something sounds weird.

At least where I work, the definition of slow does not cross complexity boundaries. An O(n) algorithm that takes 100 seconds to run is "slow" compared to an O(n) algorithm that takes 50 seconds to run. But if it crosses a complexity boundary and is now O(n^2), we state that instead of saying it's "slow", because often it's far more tragic than just taking a little extra time to run.

But how can you know how slow it is going to be under a production workload? It is very useful to be able to time some code on a small input set, then extrapolate based on knowledge of time complexity to find out roughly how long the full input will take. If you don't know the time complexity, this is impossible.
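As a rough sketch of that extrapolation idea (my own toy helper, not a real profiling tool):

```python
import math

def extrapolate(t_small, n_small, n_big, complexity):
    # Predict the runtime at n_big from a measurement at n_small, under an
    # assumed complexity class. The class names here are just illustrative.
    growth = {
        "n": lambda n: n,
        "n log n": lambda n: n * math.log2(n),
        "n^2": lambda n: n * n,
    }[complexity]
    return t_small * growth(n_big) / growth(n_small)

# Measured 2.0 s at n = 10_000; what should n = 1_000_000 cost?
for c in ("n", "n log n", "n^2"):
    print(c, round(extrapolate(2.0, 10_000, 1_000_000, c), 1))
```

Under the linear assumption the big run costs minutes; under the quadratic one, hours -- which is exactly the information the complexity class buys you.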

> But how can you know how slow it is going to be under a production workload?

In a web dev context, load testing. I appreciate that that's not so easy if your program takes a week to run.

> It is very useful to be able to time some code on a small input set, then extrapolate based on knowledge of time complexity to find out roughly how long the full input will take. If you don't know the time complexity, this is impossible.

I suppose my confusion is that with code I've written, I think I generally understand the time complexity without having to use Big-O notation. I can sort of see that there are times you might want to tell a colleague something like "careful, this code I wrote is O(n^3)", but I'm wondering what kind of job entails doing that on a regular basis.

> I can't think of many cases when saying "This algorithm is O(N^2)" is significantly more useful than saying "This algorithm is going to be really slow".

But to know that it will be slow you have to know how it scales with the input size. Because the list example may work just fine when the website has little data (oftentimes an O(N^2)-algorithm may even be faster than an O(N)-algorithm on small data) and hit you when you scale up. And the O-notation gives you a tool to estimate how the work scales. Any alternative to the O-notation is either more complex (look up Analytic Combinatorics http://ac.cs.princeton.edu/home/) or just an informal treatment of O-notation.

I work in machine learning. I regularly get jazzed when I figured out a linear solution to a problem (e.g., using a min heap to grab top k elements from a similarity computation in a recommendation system instead of first sorting), and will remark on such to anyone I'm working with. In more mundane settings, I'll also regularly use, say, hash tables to store data for convenient constant time lookup instead of naively scanning through a list each time I need to find an index.

I think I can safely say that asymptotic time and space concerns of a problem firmly guide much of my choices in data structure and algorithm use. Though this is maybe followed closely by aesthetic pleasure, novelty of trying something new and neat, or clarity in the functional organization/readability/maintainability of the code.

Even a simple PHP script that needs to work with a large data set can quickly run into problems if you don't pay attention to algorithm complexity.

Consider this simple script that first creates a dataset of strings "x0", "x2", "x4", "x6" etc. (500000 entries)

The task is to count from 0 to 1000 to see whether the string "x"+NUMBER is present. Using a linear search with array_search() is super slow, but if you switch the storage from array values to array keys, and use a hash lookup via array_key_exists(), suddenly it flies:

    % time php -dmemory_limit=512M -r '$entries = array(); for($i=0;$i<1000000;$i+=2){ $entries[] = "x".$i; }; $hits = 0; for ($i=0;$i<1000;$i++) { if(array_search("x".$i,$entries)!==false) $hits++; } var_dump($hits);'
    int(500)
    php -dmemory_limit=512M -r  11.25s user 0.09s system 99% cpu 11.362 total

So, 11.362 seconds for a linear algorithm.

    % time php -dmemory_limit=512M -r '$entries = array(); for($i=0;$i<1000000;$i+=2){ $entries["x".$i]=true; }; $hits = 0; for ($i=0;$i<1000;$i++) { if(array_key_exists("x".$i,$entries)) $hits++; } var_dump($hits);'
    int(500)
    php -dmemory_limit=512M -r  0.35s user 0.06s system 99% cpu 0.419 total

And 0.419 seconds for the hash one.

If you increase the workload to counting from 0 to 10000, it gets even worse in the linear case, while the hash key lookup finishes at about the same time as earlier!

    % time php -dmemory_limit=512M -r '$entries = array(); for($i=0;$i<1000000;$i+=2){ $entries[] = "x".$i; }; $hits = 0; for ($i=0;$i<10000;$i++) { if(array_search("x".$i,$entries)!==false) $hits++; } var_dump($hits);'
    int(5000)
    php -dmemory_limit=512M -r  115.96s user 0.29s system 99% cpu 1:56.52 total

So, 1 minute 56.52 seconds for a linear algorithm.

    % time php -dmemory_limit=512M -r '$entries = array(); for($i=0;$i<1000000;$i+=2){ $entries["x".$i]=true; }; $hits = 0; for ($i=0;$i<10000;$i++) { if(array_key_exists("x".$i,$entries)) $hits++; } var_dump($hits);'
    int(5000)
    php -dmemory_limit=512M -r  0.37s user 0.06s system 99% cpu 0.440 total

And 0.440 seconds for the hash one.

keep in mind that since PHP must preallocate the array() to an arbitrary 'sane' size, it has to 'resize' (create a new one, copy it over, point reference, destroy old one) a large number of times here.

therefore, most of the time spent here will be malloc'ing (mmap, memcpy). the rest of it will be on executing the php interpreter, compiling the script, and cleanup.

the hash search and comparisons are so much faster than those operations (in PHP, especially), this isn't really a valid way to measure your thesis

> keep in mind that since PHP must preallocate the array() to an arbitrary 'sane' size, it has to 'resize' (create a new one, copy it over, point reference, destroy old one) a large number of times here.

Those allocations only happen once in a while (because at each reallocation, the size of the array doubles), so the cost is amortized. I think that you only get about lg2(1000) = 10 allocations in total for an array of 1000 elements.
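A quick sketch of why doubling keeps the allocation count logarithmic (illustrative Python, not PHP internals):

```python
def doublings_needed(n, initial=1):
    # How many reallocations does a doubling growth strategy perform
    # before the capacity can hold n elements?
    capacity, reallocations = initial, 0
    while capacity < n:
        capacity *= 2
        reallocations += 1
    return reallocations

print(doublings_needed(1000))     # 10, since 2**10 = 1024 >= 1000
print(doublings_needed(1000000))  # 20: only 2x the reallocations for 1000x the data
```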

fair point, but they're rather costly due to the amount of memcpy'ing in my experience..

you've probably seen/know this, given that you know PHP internals :), but for others: http://nikic.github.io/2011/12/12/How-big-are-PHP-arrays-rea...

The initial array creation + startup/parse/compile/teardown should be quite constant between runs (probably most of the 0.4 seconds baseline). I think the 11sec->2min vs 0.41sec->0.44sec time increase illustrates the algorithmic complexity quite well, no?

I would argue that having a basic Big-O intuition is important for every developer. Regardless of the work you're doing, you want to write efficient code. And you want to know how something scales.

I'm also mainly a web developer. In my head, I'm thinking about Big-O a lot - not that I'd go and formally examine every bit of code, but keeping O behavior in mind leads to certain almost subconscious decisions that are going to be good for your software.

Being Big-O-aware makes sense especially in web development where latency minimization and stable scaling behavior are important. Keeping Big O in mind influences a lot of choices, from using certain language constructs to the way your database tables are modeled.

Thinking about Big O is part of being aware what your code actually does.

> Thinking about Big O is part of being aware what your code actually does.

I don't get this at all. Surely Big-O is an abstract way of describing a property of your code to somebody who doesn't want to read it. If Big-O gives you a better understanding of what your code does than writing it did, that suggests you wrote it by applying some kind of genetic algorithm and you should probably go back and look at it again until you understand it.

Maybe I didn't express this very clearly. I argue that it's paramount to understand what your code actually does. As simple as this sounds, in my experience a lot of developers just don't care. I'm not talking about applying some abstract CS concepts, I'm talking about getting a feeling for the amount of computational effort and memory consumption your code causes.

Once you understand your code's behaviors, you understand its Big O. It's not necessary to be explicit about it, but this is one of the rare cases where theoretical notation and engineering intuition do map very nicely. If you can say things like "my code's CPU cycles go up linearly with variable n", that's the equivalent of understanding Big O.

So I'm not advocating blindly "learning" about Big O, I'm trying to say that once you get a good sense of what your code is doing you get Big O understanding automatically.

> Surely Big-O is an abstract way of describing a property of your code to somebody who doesn't want to read it.

It can be, but that's not useful. I think of it as a micro vocabulary that comes naturally from understanding code. It's really not an abstract concept, it's a very concrete attribute a section of code has.

If you're working with a dumb API/library (e.g. SharePoint) where the correct default search algorithms aren't very easily accessible, it's easy to fall into the trap of using linear-time search instead of a hashed or binary search strategy, for instance. Things will look fine until your system has been used for a while, then things will gradually slow down to a crawl.

So yes, I consider knowledge of algorithms and run-time analysis an essential software development skill (I experienced this after only 6 months on the job, so obviously our experiences differ. Consulting SharePoint developer for small businesses).

I worry about Big-O almost every day. I run an analytics startup and scaling to huge datasets is important. Sure, I could throw hardware at the problem, but even that has its problems and bottlenecks and would also inflate cost. Instead, I carefully handpick the algorithms used (where possible) to have reasonable time/space complexities for the tasks I want to perform. Not everything needs to be carefully tuned, of course, but for the bits that do, Big-O is a useful tool.

Before this, I've used Big-O when writing some hobby game engine code to make sure that my algorithms scaled nicely as number of game objects increased.

I've also used Big-O to compare algorithms in back-end web development. Usually, I didn't care, but there were a few cases where it mattered (one specific one I remember was sending out notification emails based on a complex criteria and the operations I was measuring with O(n) was number of database queries).

Big-O is just a formal notation for comparing growth rates of functions. You are right of course, that often one might get away without the precise formalism and rely only on a gut feeling - but there are limitations on how many things we can compare mentally.

A bit tongue in cheek one could make the following analogous statement: does anyone use greater-than or less-than or equals signs at work, as in something being "> 5" or "< 10"? In my experience "this quantity is much larger than that other quantity" or "these two quantities are roughly the same" is perfectly fine.

But what do you do once something becomes very slow? Creating a database index, for example, is a beautiful example of O(log n) :)

Wouldn't inserting a single record in the index be O(log n) (if n is the number of records in the index), and creating the entire index be O(n log n)?

But in that case, knowing Big-O notation is a retrospective explanation of something any quarter-decent developer knows already. You don't need to know Big-O in order to know how/when to create a database index.

It doesn't have to be retrospective. A knowledge of computational complexity can be used to make predictions about the performance of a system (even one you haven't implemented yet).

What if your manager asked you for an estimate of how long it would take to load a million new records into the database and index them? It would be nice to know whether you'd have to take the system down for an hour or for a day or for a week.

What if the performance of your code was slow and you wanted to know whether it could possibly be improved, and by how much? Knowing the theoretical limitations of the algorithm you were running (e.g., comparison sorting takes at least n log n comparisons) could tell you what kind of optimizations you could reasonably expect to make or whether you needed to invest in faster hardware.

> What if your manager asked you for an estimate of how long it would take to load a million new records into the database and index them? It would be nice to know whether you'd have to take the system down for an hour or for a day or for a week.

I don't think I really buy this as a practical use case. Unless I was running on a system I understood very well (e.g. an embedded platform), I don't think I'd really trust estimates calculated using Big-O to be reliable on atypical workloads. I'd pretty much expect unforeseen caches / optimizations / swapping to disk to confound my estimates somehow. If at all possible I'd test it. If testing it takes too long to be practical then there's a serious problem with the system. I can see cases where testing is impossible for good, practical reasons, just not enough of them to justify the obsession with Big-O.

The answer provides an excellent definition and some great examples, but I actually prefer the second answer. I think that when it comes to time complexities, developing an intuitive mapping of solutions to time complexities is important.

Developing a feel for how log(n) tends to emerge in common programming techniques is much more practical than specifically checking your code for situations where "the choice of the next element on which to perform some action is one of several possibilities, and only one will need to be chosen."

That being said, I'm not the OP and it's very possible that the phone book examples are more helpful in developing that understanding for other people.

I think the first answer's intuition is actually better. It's stated a little opaquely, so maybe it's not actually useful as a teaching tool until you already understand why binary-search type algorithms that rule out a significant fraction of the input at each step are O(log n).

Basically, phrasing big-O as a property of the algorithm rather than the tree-structure of the solution space is more helpful in practice. I think it's quite common to have some problem in front of you and say "OK, so I can see immediately there's a naive solution thats linear, is there a smarter way?" And the answer nearly always comes (for me at least) when I try to answer the question, "Can I find some way to make smarter decisions at each step?" If the answer is "yes" or "maybe", you might be able to get to O(log n), but quite often the answer is "no" because you're streaming random data or something, or you have some logical reason why you have to look at every single input object, and you can immediately rule out better solutions and just get to work on the linear one.

That insight rarely comes to me when I ask "Is there any way to think of the input as a balanced binary search tree?" If there actually is such a tree structure, you're bound to find it anyways by asking "What smart decision am I making at each level?"

I agree that I rarely find myself asking whether I can represent input as a binary search tree, I just find the binary search tree image a very effective method of demonstrating how logarithms play into this at all. It's an easy example of log n complexity.

That being said, I'm not sure how much my speculation is worth here. I'm very comfortable with the idea of log n based complexity and don't remember how it was taught to me in the first place. Given that OP accepted the answer and that it has significantly more votes, it seems that it's helpful for more people.

The entire first few chapters of CLR/The White book are dedicated to explaining this in great, patient detail.


Just pick it up.

I find both the questions and answers confusing, I think this is a case when a little more "formal" understanding would help before jumping into "intuition" and big picture explanations.

If the question is really what O(log n) means, then the OP should first try to understand the definitions of big-O notation, asymptotic growth rate, etc. At this stage it is premature to talk about running times and algorithms; these are simply statements about functions. Maybe the OP is already past this stage, or at least has a rough understanding of these concepts.

If the question is where does O(log n) arise in practice, then one can explain binary search, binary trees, the representation of a number in a certain base, harmonic series, etc.
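The "representation of a number in a certain base" case is easy to check directly: the number of digits of n grows like log(n). A small sketch (function name mine):

```python
import math

def digit_count(n, base=10):
    """Count the digits of a positive integer n in the given base."""
    count = 0
    while n > 0:
        n //= base
        count += 1
    return count

# The digit count matches floor(log_base(n)) + 1:
for n in (1, 9, 10, 99, 100, 12345):
    assert digit_count(n) == math.floor(math.log10(n)) + 1
```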

Did anyone make sure this guy knows that 'log' in computer science usually means base 2? We need to remember that in most other places a bare 'log' is assumed to be base 10.

Log base 2 is just how many times you can recursively split something in half, which is not that complicated a concept. However, his log example is base 10, and if he was trying to figure these things out in base 10 it would be super confusing and wouldn't make any sense.

Well, it's like this -- log_2 and log_10 (and natural log) differ only by a constant multiplicative factor, so they're all the same asymptotically, and it doesn't matter which one you use inside the O(). Pretty neat right?

Yes, you can convert log base x to log base y. However, the point is that if OP is assuming that we're using log base 10, developing an intuitive understanding of why a binary search is O(log n) is significantly harder than if OP knows that we're referring to log base 2.

This is correct, but in case it makes more sense this way (it does to me):

log (base X) Y === log (base e) Y / log (base e) X

So the difference between log (base 2) and log (base 10) is just the difference between being divided by log (base e) 2 or log (base e) 10. Since those are both constant factors, big-O notation doesn’t care.
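A quick numerical check of this (a sketch; any pair of bases would do): the ratio log_2(n) / log_10(n) is the same constant, log_2(10), for every n.

```python
import math

# log_2(n) and log_10(n) differ only by the constant factor log_2(10),
# which is about 3.32 -- exactly the kind of factor big-O ignores:
for n in (2, 100, 10**6, 10**12):
    ratio = math.log2(n) / math.log10(n)
    assert abs(ratio - math.log2(10)) < 1e-9
```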

It's actually the same statement, just with the actual math written out. But yes, it's very easy to see it from this.

Yeah, totally, for the purposes of big-O theory it's super neat. But before you can do big O you need to be able to actually calculate the number of steps a function takes to complete. If you're a student trying to figure this out for the first time and you want to divide and conquer a list of 8 elements, you do 3 operations, thus you get log_2 8 = 3, or log_2 (n) = # of operations. This is then generalized in big-O notation, but I think trying to do big O before being able to calculate steps for an example function is putting the cart before the horse.
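That step-counting exercise is easy to write down as code (a sketch, names mine): count how many halvings it takes to get n elements down to one.

```python
def halving_steps(n):
    """How many times can n be halved before reaching a single element?"""
    steps = 0
    while n > 1:
        n //= 2
        steps += 1
    return steps

assert halving_steps(8) == 3      # log_2(8) = 3, as in the example above
assert halving_steps(1024) == 10  # log_2(1024) = 10
```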

For what it's worth, when I started studying this stuff I assumed log(n) meant log_e(n). That didn't help much either.

Why is O(N * N!) not written as O(N!)? It seems that N * N! < (N+1)! and we don't care about constant factors.

O(N!) is not the same thing as O((N+1)!).

O(f(N)) = O(f(N+1)) is true for exponential functions and less quickly growing functions but not for things past that. (And it's not true for sufficiently quickly decreasing functions, either, in the other direction, e.g. where f(N) = 1/N!)

N is not a constant factor.
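One way to see that numerically (a small sketch): if N * N! were O(N!), the ratio between them would have to stay below some fixed constant c, but the ratio is exactly N, which grows without bound.

```python
import math

# The ratio (N * N!) / N! is exactly N, so no constant c can bound it:
ratios = [(n * math.factorial(n)) // math.factorial(n) for n in range(1, 8)]
assert ratios == [1, 2, 3, 4, 5, 6, 7]
```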

Why doesn't anyone actually mention what Big O (of anything) actually means? Instead of jumping directly to a CS-related example, shouldn't the definition be made clear first?


Nobody explains Big O because the OP demonstrates that he understands the idea of Big O by giving the complexity of a piece of code and explaining how the complexity would change if the code changes.

OP is confused with the idea of O(log n), not of Big O in general.

It took forever for the coin to drop for me on log n, and it seems it hasn't dropped for the accepted SO answer either.

log n is the height of a tree when the log base and the branching factor of the tree are the same.

Hash lookups are log n, because the hash table implementation is typically a tree.

A binary search tree is O(log n), but that's not what's typically referred to as "hashing".

Hash lookups can be close to O(1) (constant time) if you use a hash function that calculates a close-to-unique index into a table based on the key's value (provided the table isn't too full, so there are few collisions). This kind of data structure is typically used for looking up identifiers (variable and function names) in a compiler.

More information here: https://en.wikipedia.org/wiki/Hash_table
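A minimal sketch of the idea (separate chaining; the class name is mine, and real hash tables are far more sophisticated): the bucket index is computed directly from the key, so an average lookup scans only one short chain, regardless of how many keys are stored.

```python
class SimpleHashTable:
    """Toy hash table with separate chaining (illustration only)."""

    def __init__(self, num_buckets=64):
        self.buckets = [[] for _ in range(num_buckets)]

    def _bucket(self, key):
        # The key hashes straight to one bucket: no search over all keys.
        return self.buckets[hash(key) % len(self.buckets)]

    def put(self, key, value):
        bucket = self._bucket(key)
        for i, (k, _) in enumerate(bucket):
            if k == key:
                bucket[i] = (key, value)  # overwrite existing entry
                return
        bucket.append((key, value))

    def get(self, key):
        for k, v in self._bucket(key):
            if k == key:
                return v
        raise KeyError(key)

# Looking up an identifier, as a compiler's symbol table might:
table = SimpleHashTable()
table.put("my_variable", "int")
assert table.get("my_variable") == "int"
```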

I found CLRS's explanation of divide and conquer gives a very intuitive understanding of what a logarithmic time algorithm is doing and why it runs in logarithmic time.

A little bit of nitpicking on the accepted answer: the list seems to imply that these are "the" possible runtimes an algorithm may exhibit. But there are many cases where algorithms run in times that lie between these. Perhaps related are log-star and O(log log N) algorithms. These are even more interesting to me than logarithmic ones.

Not sure of the universality of this, but my intuitive way of understanding O(log n) is as follows:

An algorithm is O(log n) if you can reduce the solution space (the set of all possible solutions) by a fraction (e.g. divide it in half or divide it into tenths) using a constant number of operations.

For example, binary search is O(log n) because you can cut the solution space in half (regardless of how big it is) by using a constant number of operations.
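Binary search makes that concrete (a sketch with an added step counter, names mine): doubling the input size costs only about one extra halving.

```python
def binary_search(sorted_items, target):
    """Return (index of target or -1, number of halving steps taken)."""
    lo, hi, steps = 0, len(sorted_items), 0
    while lo < hi:
        steps += 1
        mid = (lo + hi) // 2
        if sorted_items[mid] == target:
            return mid, steps
        elif sorted_items[mid] < target:
            lo = mid + 1          # rule out the lower half
        else:
            hi = mid              # rule out the upper half
    return -1, steps

# A thousandfold larger input costs only about ten more steps:
_, steps_small = binary_search(list(range(1_000)), 999)
_, steps_large = binary_search(list(range(1_000_000)), 999_999)
assert steps_small <= 10 and steps_large <= 20
```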

The use of a phonebook analogy is clever, but I think the answerer should modify his answer to cover the O() running time of a trie data structure. A phonebook is essentially a trie. He talks about the "divide and conquer" approach to navigating a phonebook, but that's not how a computer would store the names. The way a computer would navigate a phonebook would be interesting to read about.

A phonebook as trie would only apply if you had tabs for each letter all the way down. It's a vector of pages.

Maybe after this discussion PG will write an article saying that going to university is not such a bad idea after all :)

An important fact to take into account about big-O is that it represents an upper bound on the time or space complexity.

First associations for me: sympathy, elegance (without even looking at the code), the thought that the data should be prepared (most likely sorted), and the thought that this might be one particular operation and the [data structure] might imply that other operations could have different O's.

Well, and finally I'll actually look at it (click on the link).

It describes those infinite sequences of real numbers that grow no faster than logarithmically, ignoring constant factors and anything the sequence might do for a finite number of terms at the start.

Assume f : N -> R. By "f in O(log n)" we mean: there exist constants c > 0 in R and m in N such that for all n in N, if n >= m then f(n) <= c * log(n).
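To make that definition concrete, here is a small numerical witness (a sketch; the function and constants are my own choices): for f(n) = 3 * log(n) + 5, the constants c = 4 and m = 160 satisfy the definition, so f is in O(log n).

```python
import math

# f(n) = 3*log(n) + 5 is in O(log n): take c = 4 and m = 160.
# For n >= 160 we have log(n) > 5, so the "+5" is absorbed by the
# extra factor of log(n) that c = 4 provides over the 3 in f.
c, m = 4, 160
for n in range(m, 10_000):
    assert 3 * math.log(n) + 5 <= c * math.log(n)
```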

Order of growth. A shape of a function. Execution time (or memory usage) of a procedure as a function of the size of its input.

The phone book example was genius. I wish they would have taught that in school.

They usually do, if "school" is a computer science (or surely even just mathematics) program.

It is designed to estimate the amount of time it takes to run an algorithm using a data structure like a binary tree (O(log n)) instead of a different one like a linked list (O(n)). I am trying to learn it myself, so bear with me if I make any mistakes.

They throw around words like polynomial, and I've asked people with comp sci degrees to define what they are. I get back replies like "We learned it as a polynomial and never got a definition of it." OK look, here is what it means, and some people with a comp sci PhD cannot even tell me: poly means many, right, and nomial like numbers or formulas, or rather in computer science an algorithm. You have to study how algorithms work. I asked this on Hacker News and many tried to bury the story because most couldn't even answer that, and thanks to those who did. I am glad to see someone else bring it up and hope it sheds more light on it.

Consider this: you really cannot get a good estimate of how long any algorithm will run, because you have other factors in play. You have multicore CPUs now, and some of the work is offloaded to GPUs as well (like Bitcoin mining programs using GPUs; well, now algorithms can use GPUs for some of the processing). Plus you have caches on processors that you didn't have in 1971. Not only that, but the laws of thermodynamics apply to computers and electronics, and many people forget that: if your CPU is overheating it could slow down; if you have a variable fan it could cool the CPU down so it runs faster for a bit, then heats up and slows down a bit. Most operating systems are multitasking, so the program that runs the algorithm might take longer if an Apache HTTPD server with heavy traffic is running on the same machine. Not only that, but Anonymous might have sent a DDoS attack to the server the algorithm is running on. I hope those with a PhD in computer science figure those in as factors when they explore this issue.

>It is designed to estimate the amount of time it takes to run an algorithm using a data structure like a linked list instead of a different one like a binary tree.

Wat?? Excuse me, but it seems you have no idea what you're talking about.

>Consider this, that you really cannot get a good estimate of how long any algorithm can run because you got other factors in play

That has absolutely nothing to do with the complexity of the algorithm. Also, it _can_ be estimated, too.

I wonder if you're trolling or just clueless

> I wonder if you're trolling or just clueless

Given the curious interpretation of ‘polynomial’ by means of an only somewhat correct translation, I’d tend to the latter.

Well I admit I don't understand it fully. That is no reason to accuse me of trolling or downvote my comments.


Is Wikipedia not correct on this issue? I was told on one hand to look stuff up on Wikipedia, and on another I found out it is not always correct.

‘poly’ means many, yes, and ‘nomial’ in this context can stand for ‘terms’ or ‘expressions’ (not really numbers, but maybe formulae as you said). However, ‘polynomial’ does not mean ‘many terms’ when used by mathematicians and, in extension, computer scientists.

A polynomial function, in this context, means a function that is/can be expressed as a sum of powers (usually restricted to a finite number of addends) of its input variables with appropriate prefactors. For example,

  f(x) = 1
is a polynomial function, but it certainly does not have many terms (nor numbers). That the finiteness of the sum is an important requirement when it comes to complexity can be seen by the fact that the exponential function can be expressed as an infinite polynomial:

  exp(x) = 1 + x + 1/2 x² + 1/6 x³ + …
In other areas, where people are more concerned with nondivergent or differentiable functions, this requirement is often dropped and nearly everything that doesn’t have poles in x is called ‘polynomial’.

Furthermore, you didn’t only get downvoted for misinterpreting polynomial, but also for the strange mixing of theoretical CS, which is concerned with the complexity of algorithms at hand, and real-world effects such as slowdowns due to other programs multitasking, CPU caches etc. etc. which are rarely taken into account when deciding whether a given algorithm is remotely feasible to implement (runs in time, say, n^10 or faster) or just too slow.

Ah I see, I am exploring theoretical areas of computer science for which there are no clear answers yet. Downvotes for you, bad HN user for making us try to think of new theories. :)

So a polynomial doesn't need to be many of them, it can be just one of them. The 'poly' in the name threw me off, I admit. I would have assumed that was a mononomial for just one and two of them are a binomial but apparently polynomial is used instead even for only one or two? I hope you understand my confusion there. Thank you for clearing that up.

Ah I see, I am exploring theoretical areas of computer science for which there are no clear answers yet. Downvotes for you, bad HN user for making us try to think of new theories. :)

In science in general, including computer science, "theoretical" doesn't mean fringe or unexplored, it means foundational and abstract. CS theory uses the word "theory" in the same sense as music theory. CS theory deals with well-defined, rigorous analysis of computation in abstract or mathematical terms.

I would have assumed that was a mononomial for just one and two of them are a binomial but apparently polynomial is used instead even for only one or two?

A two-termed polynomial is also called a binomial. "Binomial" and "monomial" are special cases of the more general term "polynomial".

Just as poly- and mono- are prefixes of Greek origin[0], you’d probably use ‘di-’ rather than ‘bi-’ here (binomials are a little different, again), just as you have monosaccharides, disaccharides and polysaccharides[1] (aka carbohydrates).

The relevant point is that it usually doesn’t matter whether you have one, five or 500 terms in a polynomial, as the largest one will certainly dominate for sufficiently large input sizes[2] and all terms in a polynomial essentially behave the same way (being differentiable, having no poles etc.).

[0] The only thing wrong with homosexuality is smashing a Greek prefix onto a Latin root.

[1] Latin: uno-, bi-, pauc-, multi-, Greek: mono-, di-, oligo-, poly-

[2] cf. http://www.wolframalpha.com/input/?i=x%5E20%3B+x%5E20%2Bx%5E...
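Footnote [2]'s point, sketched numerically: the lower-order terms of a polynomial become a vanishing fraction of its value as x grows, so only the highest-degree term matters asymptotically.

```python
def p(x):
    # A polynomial with a few terms (chosen arbitrarily for illustration)
    return x**20 + x**19 + 500

# The ratio p(x) / x^20 approaches 1: the leading term dominates.
for x in (10, 100, 1000):
    assert p(x) / x**20 < 1.2
```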

Well I am trying to understand it better. Instead of explaining it to me better, I get downvoted instead.

Please explain it further and tell me how you can estimate the time when a DDoS attack is being done on the system, or the CPU is overheating, I'd really like to know how you estimate that. Apparently I'm clueless and in need of many clues.

Your comment was not phrased along the lines of "Can anybody please explain this?" It was phrased as though meant to impart wisdom. That's why it got downvoted and why people are correcting you — because you (apparently) meant to ask a question but ended up making a bunch of erroneous claims instead.

Big O notation is an expression of a theoretical concept that applies to the algorithm itself. It is independent of any hardware that might be used to implement an algorithm.

Specifically, asymptotic complexity is an expression of the number of abstract operations or units of memory an algorithm will require. The meaning of "operation" or "unit of memory" depends on the algorithm. In the case of comparison sorting algorithms, for example, the operation in question is the number of comparisons.

Even if the CPU time allotted to a running algorithm changes due to system load or overheating, the algorithm still performs the same number of operations. Actual running time under system load can be calculated, but this is unrelated to big-O notation or complexity.
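As a concrete sketch of "counting abstract operations" (an instrumented insertion sort, names mine): the comparison count depends only on the input, never on the machine running it.

```python
def insertion_sort_counted(items):
    """Insertion sort that also reports how many comparisons it made."""
    items = list(items)
    comparisons = 0
    for i in range(1, len(items)):
        j = i
        while j > 0:
            comparisons += 1          # one abstract operation
            if items[j - 1] <= items[j]:
                break
            items[j - 1], items[j] = items[j], items[j - 1]
            j -= 1
    return items, comparisons

# Same input => same count, no matter the hardware or system load:
result, count = insertion_sort_counted([5, 2, 4, 6, 1, 3])
assert result == [1, 2, 3, 4, 5, 6]
```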


Regarding your explanation, it is excessively wordy and uses awkward phrasing. Your subsequent complaint about the terminology used in responses to your previous questions about P=NP demonstrates a serious lack of prerequisite knowledge. You learn the definition of "polynomial" in algebra; teaching you algebra is far beyond the scope of a discussion on HN.

As someone who is primarily self taught but also finished most of a college CS education, I can understand your frustration. In order to understand one thing you need to learn three other things, and there's no map that leads you from what you know now to what you want to know. Until someone makes that map, or you know enough to make up that map as you read Wikipedia, you'll have to learn things in the order they're taught in school.

So here's what you should do if you want to understand what people are talking about on HN:

1. (Re)learn basic algebra from start to finish. Use Khan Academy. You absolutely need this to understand what a polynomial is.

2. Accept Wikipedia as a source of information. Use a dictionary and other Wikipedia articles to look up terms you don't understand.

3. Related to #2, try to study cognitive biases, critical thinking, and logical fallacies. Studying these concepts will help your brain process information, such as difficult to understand Wikipedia articles. Check out lesswrong.com.

4. Study basic CS. Look for an Introduction to Computer Science course on Coursera or elsewhere.

5. Study algorithms. Take an algorithms course on Coursera or elsewhere.

I studied computer science at the University of Missouri Rolla. My instructor Dr. Pyron said that this stuff wasn't needed to learn computer science and that most of it was hokey and kind of hinky. But he also told our freshman class that "ethics don't matter" as well. So I think I was robbed of the opportunity to learn it. BTW, I feel that ethics do matter and told him my opinion, and he said it wouldn't work in the computer business.

I had a TA for College Algebra who couldn't teach it, and the professor was nowhere to be found. They made Catch-22 jokes that he was like Major Major: you could only see him if he wasn't in his office. It made me feel better but did not help me learn. I was in the Delta Tau Delta fraternity and they helped me out, but claimed the TA had messed up some of the problems, which could only be solved with calculus, which I hadn't learned yet. To make matters worse, I got caught up in the party-college thing and underage drinking. I quit eventually. But at least I didn't get forced to meet "Alice" like some unfortunate souls. I still have my Intro to Pascal and Pascal with Data Structures books in my basement; I should re-read them and try stuff with Free Pascal.

In 2005 I had a car accident and almost died, was in a coma, and lost some memories as a result. I may just have to relearn everything all over again. Since I am on disability, free online classes and sources are my only hope. Thanks.

With 25 years of experience in the computer industry, you should be explaining to us all how to estimate the time when a DDoS attack is being done :).

Anyway, we are talking here about the complexity of the algorithm, not its actual speed or performance, which will depend on a lot of factors. Complexity tells you, for example, how many steps it will take for the algorithm to solve the problem. The algorithm can be used by humans or supercomputers, so performance can vary from days to microseconds, but the complexity will remain the same.

I think I covered that in this ebook: http://www.smashwords.com/books/view/314107 :) BTW it is a joke written by MIT's AI SciGen software to show how hard it is to tell if a thesis is serious or a parody written by clever AI software. Even if I mark it as parody and Computer Science Fiction some people are taking it seriously. Even when it says stuff like added 27 25Ghz Intel 386 chips to the network. :) The 386 chip does not run that fast.

OK, complexity then; why is polynomial time used to describe it? I know some compilers can optimize machine code better than others; how does that affect the complexity? If the performance can vary from days to microseconds, well, that is a very big gap. Video gamers always complain about 'lag', and days, well, that is one heck of a 'lag'! :)

The point of all this isn't to estimate exactly how long it will take to do a task. It's to compare how efficient one way is against another. A newer, faster computer will do them faster. A DDoS or CPU overheating will slow them down. But that's not the point.

Let's try a crap example I just made up. Imagine you want to go with your friends to see a movie. Here's a bad way to organise that. You drive to your first friend's house. Ask what they want to see. Then you drive to each of your other friends' houses and tell them what friend 1 wants to see. Then you drive to friend 2's house and ask them. And then drive to each of your friends' houses (including friend 1) to tell them all what friend 2 wants. Can you see how with a large group of friends, that would be insane? It really doesn't matter if you have a fast or a slow car. It's just crazy. A better plan would be to drive to each friend's house, ask what they want to see and write it down on a piece of paper. Then at the end you can drive around again and tell everyone what movie you are all going to see.

If you've got 2 friends -- in the first plan you visit 4 houses. And in the second plan you also visit 4 houses. With 3 friends it's 9 and 6. Then 16 and 8. Then 25 and 10. Then 100 and 20.

Now let's say the person doing plan 1 discovers this amazing thing called the telephone. The plan 2 person is still using a car. Person 1 can do things so much faster. They don't have to leave home and can call in a matter of seconds.

But let's think it through. 4 calls vs driving to 4 people. 25 calls vs driving to 10 people. Still a big win for plan 1 at the moment. But how about 10,000 calls vs 200 car trips? 1,000,000 calls vs 2,000 car trips?
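The counts in this story can be written straight down (a sketch; plan 1 makes a round of visits per friend, plan 2 makes just two rounds):

```python
def plan1_visits(n):
    # For each of the n friends: ask them, then tell everyone else.
    # Counting as in the story, this works out to n * n visits.
    return n * n

def plan2_visits(n):
    # One round to collect everyone's choice, one round to announce it.
    return 2 * n

# The numbers from the story: 4 vs 4, 9 vs 6, 16 vs 8, 25 vs 10, 100 vs 20.
assert [plan1_visits(n) for n in (2, 3, 4, 5, 10)] == [4, 9, 16, 25, 100]
assert [plan2_visits(n) for n in (2, 3, 4, 5, 10)] == [4, 6, 8, 10, 20]
```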

Does that help?
