Back of the Envelope Calculations [pdf] (yorku.ca)
60 points by jasim on Apr 9, 2013 | 43 comments

The Bureau of Labor Statistics says that there were 6,300 Musical Instrument Repairers and Tuners (code 49-9053) in the United States in 2010 and 320 in New York State in 2008. http://www.acinet.org/occ_rep.asp?next=occ_rep&Level=...

But that's a broad category. There are a lot of instruments that need repairs and maintenance that are not pianos, and they tend not to cross instrument categories, or even individual instruments. A flute repair person will not likely also repair oboes, for instance. Also, piano repairs and piano tuning are not always done by the same people.

This method breaks rather quickly because, when you multiply two order-of-magnitude estimates, you get a two-orders-of-magnitude estimate. (And so forth.) In the piano tuners example, there are some five independent quantities being estimated. Of course, the answer "I think there are 500 piano tuners in NYC" doesn't sound as impressive when you add "within five orders of magnitude"...

This is strictly true, but doesn't account for the fact that errors tend to cancel one another. For instance, if 2 of the 5 independent estimates are high, one is right on, and 2 are low, you can end up with a number very close to the actual number. And, by "very close," I mean usually within 1-2 orders of magnitude, depending on the quality of the component estimates.
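The cancellation argument can be checked with a small simulation. This is only a sketch: it assumes each of the five estimates is independently off by up to one order of magnitude in either direction, uniformly in log space, which is an illustrative choice and not something taken from the handout. The function name and parameters are made up for the example.

```python
import random

def simulate_product_error(n_factors=5, max_decades=1.0, trials=100_000, seed=0):
    """Return (median, 95th percentile) of the product's |error| in decades."""
    rng = random.Random(seed)
    errors = []
    for _ in range(trials):
        # Errors of multiplied factors add in log10 space.
        total = sum(rng.uniform(-max_decades, max_decades) for _ in range(n_factors))
        errors.append(abs(total))
    errors.sort()
    return errors[len(errors) // 2], errors[int(0.95 * len(errors))]

median_err, p95_err = simulate_product_error()
print(f"median: {median_err:.2f} decades, 95th percentile: {p95_err:.2f} decades")
```

Under these assumptions the product is typically off by roughly one order of magnitude and rarely by more than about two and a half, well short of the worst-case five.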


Errors cancel out when they're errors of magnitude, which is what I believe he was referring to, and is often the most important property of Fermi estimates.

If one estimate is off by 2x and another is off by ½x, those cancel out because the exponents are additive.
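To make "the exponents are additive" concrete, here is a minimal sketch that sums the log10 error of each factor. The helper name is hypothetical.

```python
import math

def combined_error_decades(factors):
    """Sum the log10 error of each multiplicative factor."""
    return sum(math.log10(f) for f in factors)

# One estimate 2x too high and one 2x too low: the logs sum to zero.
cancelling = combined_error_decades([2.0, 0.5])

# Two estimates both 2x too high: the errors compound to 4x instead.
compounding = combined_error_decades([2.0, 2.0])

print(cancelling, compounding)
```

The second case shows the flip side: errors only cancel when they point in opposite directions.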

Edit: Parent deleted comment…oh well.

And yet it typically yields the right answer (in the hands of an expert). I wonder why?

This is actually not a very good estimate. It has many inaccuracies, and those inaccuracies are harder to nail down than the original problem, so you made it harder on yourself.

Some questions about your result: How many piano tuners work at any of those piano tuner companies? How many piano tuner companies are not on yelp? How many piano tuners are employees of instrument stores instead of working freelance/for a tuner company?

Those are all harder to answer without industry information than the simple deductions the handout makes from common sense/knowledge.

I have the feeling it is important that people understand the limits of technology. Even though we like to say something doesn't exist if it isn't on Google, things very often do irregardless.

I always forget how difficult it is to convey sarcasm in written form. My bad.

Sweet! My guess was 6. I proceeded by estimating the number of pianos per person (my guess was 7000 pianos in NYC), then by estimating the frequency of tuning. I came up with ~20 pianos/day needing tuning and guessed the average tuner could handle 3 pianos/day (when I was a kid I recall the tuner taking a couple of hours). Some edge cases like Carnegie Hall (where the piano is probably tuned quite frequently) probably skew the numbers somewhat. Also, I would guess that a city with mostly single family homes, say, Houston, has more pianos/capita than NY or Chicago, based on the pain in the assedness of moving a piano into an apartment, though that may be offset by the number of music schools and cultural institutions.
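The chain of guesses above can be written out as a few lines of arithmetic. The tuning frequency of once per year is an assumption, back-solved from the ~20 pianos/day figure; the comment does not state it. The sketch also assumes tuners work every day of the year.

```python
pianos_in_nyc = 7000             # commenter's guess
tunings_per_piano_per_year = 1   # assumption: implied by ~20 tunings/day
tunings_per_tuner_per_day = 3    # commenter's guess

# Spread a year's worth of tunings evenly over 365 days.
tunings_per_day = pianos_in_nyc * tunings_per_piano_per_year / 365
tuners_needed = tunings_per_day / tunings_per_tuner_per_day

print(f"about {tunings_per_day:.0f} tunings/day, about {tuners_needed:.0f} tuners")
```

Changing any one input by 2x moves the answer by 2x, which is why the choice of inputs matters more than the arithmetic.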

If this was Reddit this would pretty much be /thread, however I propose the following question:

While this may be a good estimate, what about businesses not on Yelp?

"Professional Piano tuner" doesn't seem the most technologically advanced profession, so I think it's safe to assume that there are several, if not dozens more in existence in the NY area.

This is correct. In my town, there are four used game stores. Two of these stores do not exist online in any shape or form (One locally owned, one a subsidiary of game exchange).

Searching for a seamstress in my town turns up a few results for dry cleaners. A look on craigslist returns a much larger result.

Yelp's database is largely user-driven - businesses do not have to be hip to technology to appear there. The presence of a business on Yelp speaks more to the technological savviness of its customer demographic than the business itself.

That said, I actually know a piano mover in NYC and he's not on Yelp, or any website. His business is very much conducted via word of mouth and physical exchange of business cards.

When hiring engineers for my company, I might ask, what are the trade-offs in using PowerPoint-style slides to make a technical presentation? And I would expect an astute engineer who has thought about technical communication to refer me to either serious


or humorous


discussions of the defects of PowerPoint slides as a communication tool.

The slides shown here cover some interesting ground, and I was glad to see the specific reference to Enrico Fermi, who popularized the kind of estimation question discussed here in certain circles. Certainly, it is beyond dispute that

"Although most engineers remember key numbers related to their field, no-one has every detail at their fingertips

"Hence we need to estimate not only the values of numbers we need, but which numbers are appropriate, and how to perform the calculation

"the emphasis here is on 'order of magnitude' estimates – to the nearest factor of 10

"it is also important to remember that these are rough estimates and to place only appropriate reliance on the results"

The last point is the most crucial. It is vital to remember how much uncertainty there is in your estimates, or overall in your model that includes both exactly measured and estimated values. The model is not reality. It is not easy to get engineers to remember that they have to look into the mechanism of the model to understand how it works. That's why a NASA Mars probe failed when one part of the engineering team used United States customary units while another used metric units to design modules of the probe that were supposed to communicate values to each other.


What are the alternatives though? (to powerpoint)

I love chalk and talk but it puts a lot of load on the speaker. My Electricity and Magnetism professor basically pushed it to the limit I think. He had a carefully cultivated temper that was bad enough to keep 100 students totally silent for 50 minutes and he used about five different colors of chalk, but there are limits to what you can scrawl out on a blackboard.

You can distribute written documents, this gives you maximum opportunity to convey information, but you can't force people to actually read them. Going from a paper memo to email exacerbates this I think. Best case scenario, you can present very clear and well structured information but if you go past two pages you have to wonder if anyone is really paying that much attention. As a "best case example" I'd look at something like this memo that Robert McNamara wrote to LBJ about a month after LBJ took office: https://www.mtholyoke.edu/acad/intrel/pentagon3/doc156.htm . A benefit of written work is that you are committed to your words. But: creating a concise and readable written document that will effectively spread your thoughts is excruciatingly difficult. You can't write a book of background info every time you have a staff meeting.


About remembering that you have a rough model though, I think the key thing is that when you realize that there is something you are interested in measuring and once you have a model then you can start looking for quantities you can actually measure to (in)validate and tighten the model. Always be model checking.

I spent four years studying physics and mathematics, and I only had one professor that used transparent acetate sheets with notes pre-written and examples on them. All my other professors wrote out everything with chalk. Heck, we were excited when they used more than one colour of chalk, because that meant there was something exciting about to come up.

Additionally, having chalk & talk meant that many students showed up to class, instead of just relying on "reading" printouts of slides that the professor made available. I would venture a guess that I would have performed rather poorly in a class that had powerpoint slides, due to my inherent laziness in my early 20s.

Acetate rocks, especially when it's well written and you can pop into the prof's office and have a tea and chat and do some photocopying. I learned much more this way.

With chalk, we spent more time taking dictation and deciphering the babbling fool who can't communicate it all concisely from memory.

NYC is a weird place to estimate in exactly this way. It has more of a lot of types of people who have pianos other than families. NYC has a disproportionate number of wealthy people, and also a disproportionate number of professional pianists, and I would expect both those categories get their pianos tuned rather more often than a typical family.

NYC also has lots of concert halls, and any decent concert hall will tune the piano to be used in a performance on the day of. A hall like Carnegie Hall would have several pianos tuned every day.

I no do math, so maybe those things aren't enough to have a material effect on the number of piano tuners.

Book: http://mitpress.mit.edu/books/street-fighting-mathematics

Free PDF edition available there.

Jon Bentley has a chapter on this topic in the performance section of Programming Pearls. This is a very useful skill for reasoning about problem boundaries. All too often, our performance estimates are on the order of "fast" and "crazy fast". Worse yet, we build systems without understanding how they will scale under load.

> A well-known curve of a calculation’s accuracy versus mental effort goes like

    \    __
     \__/  \
Has anyone really measured this bump, or is it only folklore / myth?

It's the well known 'this graph is true because it has unexplainable data in it' bump.

The problem is not that it is unexplainable, or yet unexplained. The problem is whether it is measurable and whether it really exists.

(For example gravity is "unexplained" in some way. You have two masses, they deform the space, and with some approximations you can use the formula F=GmM/R^2. But, for example, why do masses deform the space?!?!?!?!?! But luckily you can measure it, just drop a stone or find a gravity lens, so gravity exists.)

It's certainly true in statistics and machine learning that adding variables with unreliable estimates lowers the whole model's accuracy after some point.

One problem: He assumes that the market is being adequately served, i.e. that supply and demand are relatively equal. What if tuners have a 1 year backlog and they're charging triple what you'd pay in other cities because there aren't nearly enough of them? This one piece of information could drastically affect the accuracy of any response.

Not exactly relevant when all you're concerned with is determining whether or not someone can "estimate the answer without any specialised knowledge". But, then again, why would you only be concerned with something so trivial?

Knowing the correct order of magnitude is seldom trivial. Getting the correct order of magnitude quickly will make you a better decision maker (not to mention impress people whose opinion matters).

An imbalance of supply can increase/decrease the population by maybe as much as 2x (or .5x). An order of magnitude can be the difference between a market that's worth considering and one that's too small to currently be serviceable.

This is important because in business getting a right(ish) answer quickly is often more important than getting a precisely correct answer slowly. While you're measuring precise supply and demand, FermiCo has already concluded that the market is too small and moved on.

p.s. irregardless is a great word.

If there is a 1 year backlog it is a temporary market state. The prices will rise, which will attract competition, which means more tuners, which means elimination of backlog. Free market etc.

I use this type of calculation all the time. It's often applied to a more scientific problem than piano tuners, but the key is round numbers, approximations and simplified equations. It's an incredibly useful process if you have an idea sloshing around but don't know whether it would yield anything. 10 minutes and a few estimations and you know whether it's feasible or not.


I disagree. Being able to quickly understand what kind of range something will be in is a very important skill. I've met far too many engineers and engineering students who have no real concept of magnitude. In an academic setting, that can lead to comically large or small answers. In real life it can lead to lots of wasted time.

Asking someone to talk through a problem like those discussed is often very useful. It gives you an insight into how they think, the background knowledge they can draw on and their understanding of accuracy of data. It's fairly obvious when someone's bullshitting (especially if they talked through the process with you), and any good candidate will be honest about the inaccuracies inherent in their assumption.

I've actually had a professor do this to us in a class. It is a very effective way to show that you can understand what order of magnitude to expect from an answer.

This means that when I go and plug the orbit of a satellite into STK, I'm already looking for an answer within a certain level of precision. While this isn't good enough for an answer to "how much stress does the bolt that holds down the payload have to be capable of withstanding" it does give you a sense of what that answer should be.

This is a very useful tool for realizing when your equation is returning complete crap because you are using the wrong unit (as pointed out by tolkenadult and the Mars probe example).
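A sketch of that kind of sanity check, using the satellite example: compare a computed orbital period against the rough expectation that a low Earth orbit takes about 90 minutes. The 400 km altitude and the helper names are hypothetical, and this uses Kepler's third law directly, not STK.

```python
import math

MU_EARTH = 3.986e14        # Earth's gravitational parameter, m^3/s^2
EARTH_RADIUS_M = 6.371e6   # mean Earth radius, m

def orbital_period_s(semi_major_axis_m):
    """Kepler's third law for a small satellite orbiting Earth."""
    return 2 * math.pi * math.sqrt(semi_major_axis_m ** 3 / MU_EARTH)

def within_order_of_magnitude(value, estimate):
    """True if value is within a factor of 10 of the rough estimate."""
    return abs(math.log10(value / estimate)) <= 1.0

rough_estimate_s = 90 * 60  # back-of-envelope: ~90 minutes

good = orbital_period_s(EARTH_RADIUS_M + 400e3)  # radius in metres
bad = orbital_period_s(6371 + 400)               # same orbit, mistakenly in km

print(within_order_of_magnitude(good, rough_estimate_s))
print(within_order_of_magnitude(bad, rough_estimate_s))
```

The km-for-m mistake produces a period of a fraction of a second, which the rough estimate flags immediately.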

I had a math(s) teacher when I was about 11, who would use the phrase "sit on your protractor" to talk about getting inside a geometry problem and asking yourself "does my solution roughly make sense".

I still think in those terms, asking whether the answer to a problem, whether a SWAG, estimate, or careful calculation, could actually be correct.

I've asked out-of-town job candidates "How many gas stations do you think are in this town?" but I've not had much success with that. I think for engineers it's vital to be able to do order-of-magnitude estimations and I'm trying to get my kids to learn that but it doesn't seem to come naturally to people.

I think it has to do with how you phrase the question. People go to job interviews knowing that they're going to be tested for their knowledge, and phrasing the question like that triggers a "how am I supposed to know that?" response.

But you're not actually asking them to know it, you're asking them to reason about it. So perhaps instead of asking "How many gas stations do you think are in this town?" try asking "Could you give me a reasonable estimate of the number of gas stations in this town?".

I think a lot of people just lock up when they're asked something they don't know in a synthetic situation even when in a real situation they would actually make very good judgements since in any real situation the context is often so much more clear, unless you like to keep your employees in the dark.

> I think a lot of people just lock up when they're asked something they don't know in a synthetic situation even when in a real situation they would actually make very good judgements since in any real situation the context is often so much more clear, unless you like to keep your employees in the dark.

I've had that mental block happen, not when asked to estimate something like balloons in a gym, but when asked to estimate something I knew for a fact was a single search query away.

I was asked, "So how many travelers or trips in the US each year?"

Googling "how many travelers in us" gives http://www.ustravel.org/news/press-kit/travel-facts-and-stat... as the top result.

Knowing this is that kind of data, I was unable to put myself in a frame of mind to make-believe reasoning through it. The interviewer told me to just pretend we were whiteboarding a marketing plan for a travel tool. I pointed out that in such whiteboarding or brainstorming conversations, someone would look that data point up because it's obvious a lookup would be more quickly available and more accurate than running through a Fermi-style reasoning chain.

Partly because the best developers are "lazy", morally opposed to reinvention of wheels, your point about performance in a synthetic situation is dead on.

I think the problem with a question like "How many gas stations do you think are in this town" is it is too simple to reach the answer, and the approximations are too vague. The solution for everyone is to google it. The approximation method requires some big assumptions about how many gas stations there are per-capita, or per-car, or per-area. I don't think anyone would be able to easily approach that off the top of their head and expect to get a reasonable answer. Just as approximation is a useful tool, knowing when it doesn't work is also useful.

What would your approach be to work out how many gas stations are in your town? Have you done that calculation, and come out with a reasonable answer? What assumptions did you have to make?

How often do you really need these kinds of calculations? And if you use them wouldn't it be faster / more precise in most cases to just google it?

Honest question.

The problem I see is that for most of these you at least need SOME information (e.g. the number of inhabitants of NYC) to start from.

To get that initial information you'll most likely use a search engine anyway, so why not just invest a little more effort to find an exact number?

These questions are never about the answer, but about how someone would go about getting the answer.

Some people freeze. Some of those people stay frozen even after a bit of discussion. Some people just cannot start the process of answering a question like this.

Luckily, most places have stopped asking questions like this because they're just not very good for whatever those people are recruiting for.

I actually did this the other day to show a friend that a newspaper had greatly exaggerated the number of prostitutes in Spain (300.000).

There are 40M people in Spain, 50% are women, and 50% of women are of an age suitable for the trade. It is impossible that 3% of them, or 1 in 30, are prostitutes.
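Spelled out as arithmetic, using only the figures given in the comment (the variable names are just for illustration):

```python
population = 40_000_000        # people in Spain, from the comment
share_women = 0.5
share_of_suitable_age = 0.5
claimed = 300_000              # the newspaper's figure

eligible = population * share_women * share_of_suitable_age
implied_fraction = claimed / eligible

print(f"{eligible:,.0f} eligible women, implied share {implied_fraction:.1%}")
```

The newspaper's number implies about 3% of the eligible group, which is the figure the plausibility check rests on.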

If only all requirements were up front and found in the initial discourse.

Never thought I'd see my first year eng prof's lecture on HN. :)

This lecture note is credited to Professor Ben Quine, a popular and favourite eng prof at York University, Canada.

Really, I had him a couple years ago. I'm in CompEng, what field are you in, we might know each other ;).

We do know each other. Very well :0

I was taught by the engineer who created those slides. Pretty interesting guy. Has Professor Quine okayed the slides being posted?
