He's not benchmarking an O(1) operation against an O(n) one, or anything surprising, or even pointing out a "hidden" O(n) operation; it's simply a demonstration that O(n) + O(1) = O(n).
The main point is that the big O notation ignores the cache misses.
- try writing a quicksort as a template to avoid the comparison function calls
- in the sorting loop, switch to an insertion sort when buckets have <= 32 elements
You will see speedups of 2 to 3 times over a vanilla implementation of quicksort.
Although this approach is slightly more complex at the abstract level, it's faster because it avoids a lot of cache misses. A rough sketch of the idea follows.
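A minimal sketch of that idea (not the parent's actual code; the cutoff of 32 and all names are illustrative), assuming C++ so the comparator is a template parameter whose calls can be inlined instead of going through a function pointer:

```cpp
#include <cstddef>
#include <utility>

// Insertion sort for small ranges: good locality, no recursion overhead.
template <typename T, typename Cmp>
void insertion_sort(T* a, std::ptrdiff_t n, Cmp cmp) {
    for (std::ptrdiff_t i = 1; i < n; ++i) {
        T key = std::move(a[i]);
        std::ptrdiff_t j = i;
        while (j > 0 && cmp(key, a[j - 1])) {
            a[j] = std::move(a[j - 1]);
            --j;
        }
        a[j] = std::move(key);
    }
}

// Quicksort with the comparator as a template parameter (inlinable calls)
// and an insertion-sort fallback once a partition gets small.
template <typename T, typename Cmp>
void quicksort(T* a, std::ptrdiff_t n, Cmp cmp) {
    while (n > 32) {                       // cutoff value is illustrative
        T pivot = a[n / 2];
        std::ptrdiff_t i = 0, j = n - 1;
        while (i <= j) {                   // classic two-index partition
            while (cmp(a[i], pivot)) ++i;
            while (cmp(pivot, a[j])) --j;
            if (i <= j) { std::swap(a[i], a[j]); ++i; --j; }
        }
        quicksort(a, j + 1, cmp);          // left part: [0, j]
        a += i; n -= i;                    // keep looping on the right part
    }
    insertion_sort(a, n, cmp);
}
```

Used as e.g. `quicksort(v.data(), (std::ptrdiff_t)v.size(), [](const int& x, const int& y) { return x < y; });`. The small partitions handed to the insertion sort usually already sit in cache, which is where much of the win comes from.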
O(n) + O(n) = O(2n) = O(n) = O(n + 1) = O(n) + O(1)
Cache performance is just one example of a constant factor, right?
The point is to use data that was recently fetched into the (faster) cache memory as much as possible instead of incurring the penalty of a cache miss.
Cache performance really depends on memory access patterns.
Anyways :) I'm being pedantic here, I should probably go back to work.
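A toy illustration of the access-pattern point (example mine, not from the thread): the two functions below do the same O(rows * cols) work, but the row-major loop walks consecutive addresses while the column-major loop strides `cols` elements at a time and misses the cache far more often.

```cpp
#include <cstddef>
#include <vector>

// Sums every element; touches memory sequentially, so it is cache friendly.
long long sum_row_major(const std::vector<long long>& m,
                        std::size_t rows, std::size_t cols) {
    long long s = 0;
    for (std::size_t r = 0; r < rows; ++r)
        for (std::size_t c = 0; c < cols; ++c)
            s += m[r * cols + c];          // consecutive addresses
    return s;
}

// Same result, same Big-O, but jumps `cols` elements between reads.
long long sum_col_major(const std::vector<long long>& m,
                        std::size_t rows, std::size_t cols) {
    long long s = 0;
    for (std::size_t c = 0; c < cols; ++c)
        for (std::size_t r = 0; r < rows; ++r)
            s += m[r * cols + c];          // large stride, frequent misses
    return s;
}
```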
Yes, random access (missing the cache every time) may be 100x slower than sequential (hitting the cache almost always), but if you're iterating through an array twice as large, it will still be about 100x slower, not worse: the factor doesn't grow with n.
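A rough sketch of the kind of measurement behind that claim (the sizes, harness, and the 100x figure are all illustrative): time one pass over the data in index order and one in shuffled order, at n and at 2n. Both times roughly double, while the sequential/random ratio stays roughly the same.

```cpp
#include <algorithm>
#include <chrono>
#include <cstdint>
#include <cstdio>
#include <numeric>
#include <random>
#include <vector>

// Time one pass over `data`, visiting elements in the order given by `order`.
static double time_pass(const std::vector<std::uint64_t>& data,
                        const std::vector<std::size_t>& order) {
    auto t0 = std::chrono::steady_clock::now();
    std::uint64_t sink = 0;
    for (std::size_t i : order) sink += data[i];
    auto t1 = std::chrono::steady_clock::now();
    if (sink == 42) std::puts("");         // keep the loop from being optimized away
    return std::chrono::duration<double>(t1 - t0).count();
}

int main() {
    std::mt19937_64 rng(1);
    for (std::size_t n : {std::size_t{1} << 24, std::size_t{1} << 25}) {  // n, then 2n
        std::vector<std::uint64_t> data(n, 1);
        std::vector<std::size_t> order(n);
        std::iota(order.begin(), order.end(), 0);
        double seq = time_pass(data, order);           // sequential: mostly hits
        std::shuffle(order.begin(), order.end(), rng);
        double rnd = time_pass(data, order);           // random: mostly misses
        std::printf("n=%zu  seq=%.3fs  rnd=%.3fs  ratio=%.1fx\n",
                    n, seq, rnd, rnd / seq);
    }
}
```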
I don't know about you, but I don't sort 100k records in a single batch, and if I am it's because I messed up. But I might sort different batches of 100 records 1000 times a minute.
If your software engineering begins and ends with Big-O notation, you're likely doing a terrible job. Measure, and test. Always.
As a field, do we measure and test the effect of using various programming paradigms, patterns, and coding standards? I don't think most companies do this. Most developer decisions are made by "gut." As a field, we are still more like the "before" of Moneyball than the after.
This seems to be entirely forgotten by the "measure, test" crowd: the scientific method requires a hypothesis to test for measurements to be meaningful. Separately, not everything that gets built is "tested" in every aspect; many things are well established, and applying that existing theoretical framework makes measurement and testing a waste of time.
Having a theoretical understanding of the thing you care about is far more important than the "measure, test" step, which is certainly important but by itself is of really limited power.
i.e., it's not about checking your hypothesis, it's about checking the method by which you formed your hypothesis.
Checking your hypothesis is fine, but of value only to reject, not to actually find a hypothesis you can accept.
That's the simplistic grade school version. In practice, characterization and measurement is part of a feedback cycle that is used to develop and refine hypotheses. Experiments need a hypothesis to test, but measurements do not require experiments. Where the "measure, test" crowd usually goes off the rails is in not entering that scientific feedback cycle after the initial measurements and instead jumping to conclusions based on gut feel interpretation of them.
However I disagree that measurements have any meaning unless they are conducted as experiments, or potentially as exploratory work looking for an experiment to conduct. Even such exploratory work needs to be driven by at least the loosest of hypotheses, since measuring everything is infeasible. So this is just as much an experiment, just one you don't have a high expectation of predicting the outcome of.
Either way, my fundamental point is that the necessity of at least a minimal hypothesis in all of these scenarios typically goes unrecognized.
The cycle of revision of that hypothesis is clearly the next step and, as I say, I thought clearly implied.
> Either way, my fundamental point is that the necessity of at least a minimal hypothesis in all of these scenarios typically goes unrecognized.
That's an overly broad definition of hypothesis that results in every decision-making process fitting the model. Even the "measure, test" people you were complaining about have that. They formulate a "minimal hypothesis", take measurements, fix what they see, and call it a day. What they don't do is iterate on the process and get down to the root causes.
Furthermore, in some closed systems measuring everything (at some level of granularity) is feasible. It leads to a troubleshooting cycle of continuously focusing the measurement towards an indeterminate point, with the specific point being the answer. This kind of fault isolation is much more productive, in general, than continuously developing and testing hypotheses about what specific things might be an issue.
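A tiny sketch of what that focusing loop can look like (entirely hypothetical names; it assumes you can cheaply measure "is the problem already visible after stage i"): each measurement halves the remaining span instead of testing a guessed culprit.

```cpp
#include <cstddef>
#include <functional>

// Assumes the fault is visible at the end (after stage n) and absent at the
// start (after stage 0). Each measurement halves the candidate range, so the
// offending stage is found in O(log n) measurements rather than by guesswork.
std::size_t first_bad_stage(std::size_t n,
                            const std::function<bool(std::size_t)>& visible_after) {
    std::size_t lo = 0, hi = n;              // invariant: good after lo, bad after hi
    while (hi - lo > 1) {
        std::size_t mid = lo + (hi - lo) / 2;
        if (visible_after(mid)) hi = mid; else lo = mid;
    }
    return hi;                               // first stage after which the fault shows up
}
```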
You don't have to start with measurements. You can often do some analysis before you have anything to test, and that may save you a lot of time measuring, testing and tinkering with something that doesn't have a prayer of working. If you are measuring the environment that the system will interact with, you have already done some analysis in determining what to measure, and developed some theories about what is going to matter; that is hard to do without some causal reasoning. If your tests show you have a problem, you need to think about what is going on in order to find a solution.
There's a curious anti-intellectualism in software development that is opposed to any suggestion that thinking about problems is useful. It seems to go hand-in-hand with a simplistic world view that only sees dichotomies, together with the assumption that there can only be one answer.
I'm not making one. I'm saying that measurements don't require an experiment to be meaningful.
> You don't have to start with measurements.
I didn't say you did, I said you can. The person I was responding to claimed otherwise.
> There's a curious anti-intellectualism in software development that is opposed to any suggestion that thinking about problems is useful. It seems to go hand-in-hand with a simplistic world view that only sees dichotomies, together with the assumption that there can only be one answer.
As someone who constantly rails at the lack of engineering rigor in software "engineering", I agree completely.
(the business people always pick the worse solution that sounds better unless you can explain why they should care, in english, meant for normal human beings, without sounding condescending about it)