What has a 1 in a million chance? (2010) (stat.berkeley.edu)
269 points by ksec 9 months ago | 187 comments



> "Scientists have calculated that the chances of something so patently absurd actually existing are millions to one. But magicians have calculated that million-to-one chances crop up nine times out of ten."

-- Terry Pratchett

Making fun of the fact that if someone says "it's a million-to-one chance, but it might just work!" in fiction, you know it's going to work.

In _Guards! Guards!_ this is taken to the point that they reckon that it's not enough to hit the dragon with the arrow at the soft spot, they also have to try a whole bunch of improbable circumstances to get the chance to exactly 1 in a million. Because exactly 1 in a million is hard to achieve, as the article shows.


The cheese-loving aviators have the Swiss cheese model to visualize how small, unimportant errors can stack up to a catastrophic outcome. Though in my opinion there's a flaw in this thinking, as no sane cheese seller would shuffle their layers of cheese after cutting them. Hence 1-in-a-million chances happen all the time. https://www.aviationfile.com/swiss-cheese-model/


You'd think that but the Toledo fondue flood of 1973 says otherwise.

I can't find the wiki link, but Ronald O'Sullivan, who'd just taken over his father's cheese shop, was attempting to make a single block of Swiss after a long week of selling the first 80% of several blocks to different restaurants. He took those cheese tails and stacked them up, not realizing the danger he was putting everyone else in. It was later found that there were several other contributing factors; he hadn't used a properly certified cheese cutting blade (baloney!) and had reused the wax liner to store the off-cuts prior to reassembling the cheese that ultimately failed.

Some say you can still smell the Swiss on hot August days.


Need a source on this, please. It sounds real at first look, but I can't find any links and it's very similar to the Boston molasses flood of 1919 story, especially that last bit -

"The event entered local folklore and residents reported for decades afterwards that the area still smelled of molasses on hot summer days."

https://en.m.wikipedia.org/wiki/Great_Molasses_Flood


Fondue can't cause a flood like the Great Molasses Flood because it's not capable of truly fully laminar flows.

Fondue is a colloidal suspension of cheese solids in a mixture of wine and melting agents and it shows complex rheological behavior. Unlike Newtonian fluids, where viscosity remains constant regardless of the applied shear rate, fondue demonstrates non-Newtonian features, especially shear-thinning. As the shear rate increases, its viscosity decreases, a phenomenon attributed to the alignment of colloidal particles in the direction of flow, interfering with the fluid and preventing it from filling every nook and cranny like a flood would.

Fondue's inability to achieve truly laminar flow is rooted in its viscoelastic properties. When subjected to stress, fondue exhibits both viscous and elastic characteristics, a behavior modeled by the Maxwell model in rheology. This model combines a viscous damper and an elastic spring to describe the material's response to applied stress. As fondue is subjected to shear stress, its structure becomes disrupted, leading to a breakdown in its ability to maintain a cohesive flow front. This disruption manifests as a turbulent flow, preventing the formation of a flood-like scenario similar to that caused by molasses.


It's 100% made up. It's an attempt to make fun of people who would consider it unlikely that anyone would ever stack cheese as a way to mitigate safety risks.


"I can't find the wiki link" was a good clue. I'll have to remember that line next time I'm kidding someone!


I don’t know if I’ve ever seen it investigated as a mathematical puzzle but there’s probably some interesting number stuff to be found in investigating the mathematics of an idealized swiss cheese risk model.

If you take a series of unit squares and from each one remove a random circle (ah, the origin of every protracted probability argument among mathematicians: a random circle selected from which distribution?), then stack them up… after n squares, what is the expected hole size? Or how many squares do you need to stack to reduce the hole size to below a particular threshold?

Circle intersection geometry is hard though. Probably easier to start with axis aligned square holes, which are what you get when you make your Swiss cheese out of milk from spherical cows.
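For the axis-aligned version of the puzzle, a rough Monte Carlo sketch is easy to write, because the intersection of axis-aligned squares is always a rectangle. The hole side of 0.3, the uniform placement, and the function names are all assumptions picked for illustration; a different hole distribution would give different numbers.

```python
import random

def through_hole_area(n_layers, hole_side=0.3, rng=None):
    """Open area left after stacking n_layers unit squares, each with one
    axis-aligned square hole of side hole_side placed uniformly at random."""
    rng = rng or random.Random()
    # The common opening is a rectangle, so only its bounds need tracking.
    x_lo, x_hi, y_lo, y_hi = 0.0, 1.0, 0.0, 1.0
    for _ in range(n_layers):
        x = rng.uniform(0.0, 1.0 - hole_side)
        y = rng.uniform(0.0, 1.0 - hole_side)
        x_lo, x_hi = max(x_lo, x), min(x_hi, x + hole_side)
        y_lo, y_hi = max(y_lo, y), min(y_hi, y + hole_side)
    return max(0.0, x_hi - x_lo) * max(0.0, y_hi - y_lo)

def expected_open_area(n_layers, hole_side=0.3, trials=20_000, seed=0):
    """Monte Carlo estimate of the mean open area after n_layers slices."""
    rng = random.Random(seed)
    total = sum(through_hole_area(n_layers, hole_side, rng) for _ in range(trials))
    return total / trials

for n in (1, 2, 3, 5):
    print(n, expected_open_area(n))
```

With one slice the open area is just the hole (0.09 here), and it shrinks quickly as independent slices are added, which is the whole point of the model and also why the "no shuffling" objection matters.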


No, you get axis-aligned square holes when you get your milk from cubical cows.


Cows are spherical


Cube is sphere in L_{\infty}.


I am a pilot, and in aviation related topics pilots tend to be almost ridiculously pedantic about stuff.

I want to congratulate you on writing the most pedantic thing I've ever read in my life. Truly a masterpiece. I can't wait to bust this out in my next aviation related discussion.


I'm glad I'm not the only one bothered by the implied slice shuffling in this "model". Lol! We need to think of another real world setting where a stack of randomly hole-filled sheets would be naturally shuffled.


Punch cards? Might be a few years out of date to be readily imagined by an average member of the public.



I was thinking of loading the dice in some CRPG so that if some event has exactly a 1/1,000,000 chance then it resolves as if it were 9/10.


They had a point there, even if not realising it, as that world's rules conform not to regular physics but the general population's beliefs.


Million-to-one chances are a running theme in a few of his books, but aiming specifically for one, as in Guards! Guards!, is one of the more awesome things in the Discworld.


It seems like a pretty direct homage to the Infinite Improbability Drive from Hitchhiker's Guide to the Galaxy, which is created from thin air by working out exactly how improbable such a device is and plugging this number into a finite improbability generator (such machines themselves "often used to break the ice at parties by making all the molecules in the hostess's undergarments leap simultaneously one foot to the left, in accordance with the theory of indeterminacy.").


TFA: "If you tossed the coins [20 tails in a row] then the first answer would be NO, unless I'm very confident you lack the ability to fool me … "


Yeah, what does this even mean? Isn't coin flipping a random process?


I read it as meaning that there's a much better chance that you're able to perform a trick, so the chance is the probability of flipping them for real, plus the chance of you being able to perform a trick to make it seem that you have.


In general no, it is a chaotic system, and humans are just bad at consistency.

https://www.npr.org/2004/02/24/1697475/the-not-so-random-coi...


Anecdotally, with practice, some people can flip a coin to a desired outcome like 65% of the time. And .65^20 is only around 1 in 5,500.
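A quick check of that arithmetic, taking the anecdotal 65% per-flip success rate at face value (the function name is just for illustration):

```python
def streak_probability(p_per_flip, n_flips):
    """Chance of n_flips independent flips all landing the desired way."""
    return p_per_flip ** n_flips

fair = streak_probability(0.5, 20)
skilled = streak_probability(0.65, 20)

print(f"fair coin:   1 in {1 / fair:,.0f}")     # about 1 in a million
print(f"65% flipper: 1 in {1 / skilled:,.0f}")  # a few thousand to one
```

So a modest per-flip edge moves 20 tails in a row from roughly one in a million to something on the order of one in several thousand.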


You flip the coin, catch it, then present it on the back of your other hand.

With a bit of training you can feel the coin with your thumb when catching it to make sure you present the desired side. If you do it quickly enough people won't see you do it. The trick requires coins that have sides that feel distinctive.


GNU Terry!


I reread Eric this weekend, somewhat on a whim. Real pleasure. I also had totally forgotten he came up with the premise of the Good Place. He had so many ideas.


Reminds me of https://markxu.com/strong-evidence

«One time, someone asked me what my name was. I said, “Mark Xu.” Afterward, they probably believed my name was “Mark Xu.” I’m guessing they would have happily accepted a bet at 20:1 odds that my driver’s license would say “Mark Xu” on it.

The prior odds that someone’s name is “Mark Xu” are generously 1:1,000,000. Posterior odds of 20:1 implies that the odds ratio of me saying “Mark Xu” is 20,000,000:1, or roughly 24 bits of evidence. That’s a lot of evidence.»

«Extraordinary claims require extraordinary evidence, but extraordinary evidence might be more common than you think.»


Someone offering that bet would also alter the odds, though. Because why would someone offer such a ridiculous bet after you've told them your name, unless they had special information about your name? Although the possibility of it being a YouTube prank channel would also make it possible that you actually have a chance to win, for the reactions.


This is part of a very important principle: if someone approaches you about something, you should be far more suspicious than if you had randomly picked someone off the street.

The most well known application of this is teaching kids to ask a random shopkeeper for help if they get lost, but to not get in a stranger's car when offered a lift.


> One of these days in your travels, a guy is going to show you a brand-new deck of cards on which the seal is not yet broken. Then this guy is going to offer to bet you that he can make the jack of spades jump out of this brand-new deck of cards and squirt cider in your ear. But, son, do not accept this bet, because as sure as you stand there, you're going to wind up with an ear full of cider.

(Guys and Dolls)


This also applies to self defense. If someone has targeted you, the odds of you beating them in a fight are different from the odds that you can take on a random person, or even a random thief.


Better or worse odds for you?


Also good advice for dating!

If an attractive lady walks up to me at a party, first thing I'm doing is asking for references.


This must come up a lot.


About as often as you'd expect.


My dude…


I would put the odds of Mark Xu’s license having that name on it at about 15%. Because Mark is probably an anglicised Asian name.


Not to mention that many Chinese immigrants will give their children European names.


In that case the license will match?

The question is whether you are using the name on your driver's license, which is probably the one on your birth certificate.


The OP covers this a bit as well:

> 20 coin tosses (by me) all coming up Tails. YES

> If you tossed the coins then the first answer would be NO, unless I'm very confident you lack the ability to fool me …


This article seems to twist the definition of “extraordinary” to something clearly not intended by the original quote about claims and evidence.

Mark is a very common first name. Xu is a Chinese surname shared by over ten million people (according to Wikipedia). It’s entirely ordinary that someone would have this combination of names.


Someone, yes. The person in front of you? Unlikely.

Analogy: someone has a winning lottery ticket. Is it the one in your hand? Probably not.


That’s a good analogy for my point, because just having a lottery ticket isn’t extraordinary even though all of the tickets are unique.

If somebody claims they are holding a winning lottery ticket and will sell it to me for a 50% discount because they are leaving the country in two hours and don’t have time to cash it out, that’s an extraordinary claim and I would need extraordinary evidence to take the deal.

If someone says their name is Mark Xu, it’s like saying they are holding a non-winning lottery ticket with serial number 12345654321. It’s at best a curiosity.


Don't ask me why, but I had the urge to look up the number of Chinese surnames: about 2,000 are currently in use, and half the Han Chinese population uses just 19 surnames. That is quite low compared to the 850 THOUSAND (another source has more than a million) surnames used in Germany, twice as many as in Spain despite it having a tradition of double names. I guess that comes from Germany being such a central state with many migration waves over the last 600~800 years since family names became common.


It's even more concentrated in Vietnam. A very large chunk of the country is called Nguyen. Korea has similar with Kim/Park/Lee.


Why is the 20 multiplied by the million? Why not 0.95 multiplied by million?

Using the same argument I would accept infinite odds that my username is quickthrower2 so there is infinite information?


It's important to understand that when they said "bits" they didn't mean information in the Shannon entropy sense, but rather in the log-odds evidence sense.

Gaining a Shannon entropy bit means learning the answer to a yes-no question that had 1:1 odds.

Gaining a log-odds evidence bit means doubling your best-guess odds on a question you are uncertain about, from X:Y to (2X):Y.

One Shannon bit is worth arbitrarily many evidence bits, because a Shannon bit takes you from 1:1 odds to UNBOUNDEDLYHUGE:1 odds. So... yeah, actually, reading your username is worth infinite bits of log-odds evidence on what your username is! (Ignoring practical issues like the small chance of computer malfunctions, of course.)

And to answer your initial question: the 20 just came from the assertion they'd bet 20:1. That was arbitrary.


This isn't how the mathematics of odds works, as the GP correctly pointed out. An event which is 20:1 on is not 20 times more likely than certainty.

Going from 1:4 to 1:2 means that the event has become twice as likely. But going from 2:1 to 4:1 does not: it means that the complementary event has become half as likely.

Based on this, we can't do math with odds treating them identically to ratios.

If you do the math correctly, the two types of information measure are basically the same thing.


A log-odds of b bits means an odds of 2^b : 1 which means a probability of p = 2^b / (2^b + 1).

In the original comment, the evidence update was stated as going from 1:1000000 to 20:1 and it was claimed this was approximately 24 bits of evidence. The update is from 2^-19.9:1 to 2^4.3:1. Subtracting the exponents you get 4.3 - -19.9 = 24.2 which is approximately 24 as claimed. The "20" in 20:1 is correctly accounted for by the ~4 additional bits of evidence on top of updating from 1:1000000 to 1:1.

Clearly evidence bits behave very differently from entropy bits. Acquiring a single entropy bit is an update from 1:1 to 0:1 which is 2^0:1 to 2^-infinity:1. It's worth an unbounded number of evidence bits. It's important not to mix these two things up.
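The log-odds bookkeeping in this comment can be sketched in a few lines. The helper names are made up for illustration; the only inputs are the 1:1,000,000 prior and the 20:1 posterior from the quoted post.

```python
import math

def odds_to_bits(in_favor, against):
    """Log-odds in bits: b such that the odds are 2**b : 1."""
    return math.log2(in_favor / against)

def bits_to_probability(bits):
    """Probability implied by log-odds: p = 2^b / (2^b + 1)."""
    odds = 2.0 ** bits
    return odds / (odds + 1.0)

prior_bits = odds_to_bits(1, 1_000_000)   # about -19.9
posterior_bits = odds_to_bits(20, 1)      # about 4.3
evidence_bits = posterior_bits - prior_bits

print(f"prior {prior_bits:.1f}, posterior {posterior_bits:.1f}, "
      f"evidence {evidence_bits:.1f} bits")
print(f"posterior probability {bits_to_probability(posterior_bits):.3f}")
```

The difference of the two exponents is the same number you get from log2 of 20,000,000, which is where the "roughly 24 bits" comes from.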


Yes, you are doing math with odds as though they are fractions or ratios, which is deeply incorrect. 20:1 is not the reciprocal of 1:20 but the complement. Odds ratios and similar calculations do not work like this. You can do this type of calculation using X:1 odds or 1:X odds but not both in the same calculation.

Or perhaps you can provide a reference to a justification of this type of calculation?


Search "decibels" in https://www.yudkowsky.net/rational/bayes for the explanation.

I think you're just wrong about needing everything to be in the form X:1 or 1:X. When I compute the ratio of 1000000:1 divided by 1:20 it gives 1000000:(1/20) then scaling both sides by the same factor gives 20000000:1.


Your reference definitely doesn't show a calculation of the type you are trying to do. Likelihood ratios are not the same as odds ratios; they do not have the problem I described.

I would be very surprised if you can find any reference at all to the number you describe as 'evidence bits', or anything equivalent, made by anyone who can show an understanding of basic probability, statistics, or information theory.

I understand how you get 20,000,000 as the answer to the calculation you carry out. My point is that that number is not meaningful in any way.


When you apply a statistical test, the various outcomes cause Bayesian updates that correspond to adding or subtracting fixed bits of evidence. When you repeat the test (and the repetitions are independent), the number of bits of evidence you add or subtract remains the same. In other words, focusing on bits of evidence shows that Bayesian updates behave like a biased random walk under repetition of a test, and allows you to compute the properties of that walk.

For example, suppose you are trying to estimate how much rounding errors in a pseudo random number generator betray that it is not a true exact representation of the random process. One way to quantify this is to compute the expected bits of evidence revealed per call to the RNG.
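A minimal sketch of that random-walk picture, using a coin instead of an RNG. The two hypotheses (a 65% coin versus a fair one) and all names here are assumptions chosen for illustration, not anything from the comment itself.

```python
import math
import random

def evidence_bits_per_flip(is_heads, p_biased=0.65, p_fair=0.5):
    """Log2 likelihood ratio (biased vs. fair) contributed by one flip."""
    if is_heads:
        return math.log2(p_biased / p_fair)
    return math.log2((1 - p_biased) / (1 - p_fair))

def accumulate_evidence(flips, prior_bits=0.0):
    """Each independent flip adds one of two fixed step sizes to the log-odds."""
    total = prior_bits
    for is_heads in flips:
        total += evidence_bits_per_flip(is_heads)
    return total

def expected_bits_per_flip(p_true=0.65):
    """Average drift of the walk when the coin really lands heads p_true of the time."""
    return (p_true * evidence_bits_per_flip(True)
            + (1 - p_true) * evidence_bits_per_flip(False))

rng = random.Random(1)
flips = [rng.random() < 0.65 for _ in range(200)]
print(accumulate_evidence(flips), expected_bits_per_flip() * len(flips))
```

Heads always adds the same positive step and tails always subtracts the same negative one, so the log-odds wander like a biased walk whose drift is the expected bits of evidence per trial.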


There's something so wrong with that logic. Imagine it as:

Mark: "I've thought of a number between one and a billion and written it down. The number I wrote down was 6,822,172"

Recursing (you): "ok"

Mark: "ohh you believe me? Then let's take a bet, if I did write down 6,822,172 then you win. If I wrote down any other number between one and a billion then I win".

Would you take that bet? I think you'd be suspicious.

Mark: "I predict that Recursing would take that bet, happily, because why wouldn't they believe me? Therefore them taking the bet is very strong evidence that I actually did write down 6,822,172".

You can't use your prediction about someone else's behaviour as evidence!


24 bits seems about right for the information content of six Latin characters arranged in a pronounceable English orthography (the ‘X’ has pretty high information value though).


The following are all ~ 1 in a million chance of death, or 1 micromort:

Travelling 6 miles (9.7 km) by motorcycle

Travelling 17 miles (27 km) by walking

Travelling 20 miles (32 km) by bicycle

Travelling 230 miles (370 km) by car

Travelling 1,000 miles (1,600 km) by jet

Travelling 6,000 miles (9,656 km) by train

But switching from car to bicycle for short trips still increases life expectancy due to health effects.

Sources: https://en.wikipedia.org/wiki/Micromort https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2920084/


Chicago to LA is about 2,000 miles.

So in the article about 200 miles driving (in California) is 1 in a million chance of dying. So let's use that number nationwide to be lazy.

Now we can move a decimal point over. So the chance of dying on a Chicago to LA drive is 1 in 100,000. But you drive back, so then it's twice that. One in 50,000 people dying on a Chicago to LA and back road trip is extremely frightening. How many people from the Midwest make this drive a year? Or from the East Coast? How many don't make it back?

The USA, on average, has 100+ fatalities via auto transportation a day.

The above ignores serious injury, permanent disability, etc. It's just death. The chances of having to deal with a broken spine, losing a limb, blindness, 3rd degree burns all over your body, etc. aren't even calculated, but those are real and far more common than death, death being harder to achieve with modern medical treatments.

Cars are extremely dangerous. We downplay what it means to drive.
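The road-trip arithmetic, spelled out as a small sketch. It uses this comment's rounded figure of 200 miles of driving per micromort (another comment in the thread quotes 230), and the names are only illustrative.

```python
MILES_PER_MICROMORT_CAR = 200  # rounded figure used in the comment above

def trip_risk(miles, miles_per_micromort=MILES_PER_MICROMORT_CAR):
    """Return (micromorts, probability of death) for a car trip of given length."""
    micromorts = miles / miles_per_micromort
    probability = micromorts / 1_000_000
    return micromorts, probability

one_way = trip_risk(2_000)         # Chicago to LA
round_trip = trip_risk(2 * 2_000)  # and back

print(f"one way:    {one_way[0]:.0f} micromorts, 1 in {1 / one_way[1]:,.0f}")
print(f"round trip: {round_trip[0]:.0f} micromorts, 1 in {1 / round_trip[1]:,.0f}")
```

That gives 10 micromorts each way, so about 1 in 100,000 one way and 1 in 50,000 for the round trip.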


> We downplay what it means to drive

I wondered about that reading some of the comments about the 737 Max. We routinely travel in exponentially more risky ways all the time, yet we expend time thinking about the safety benefit of avoiding a specific type of aircraft.

Not downplaying airline safety as a whole there, but thinking about it for yourself on a personal level is maybe not moving the needle.


Agreed. It must’ve been terrible for the people on that plane, but I don’t think it’s worth my time to worry about what plane I’m flying on, or what the maintenance record of the airline is. For developed countries, they’re super good enough.


As a cyclist that’s trying to not die, I also have to assume this heavily depends on where you’re cycling. Along a stroad? You’re dead. Country road with more bikes than cars? Much safer.


Also you can rack up miles on the interstate like no one's business on a road trip etc. imo it should be by unit time not unit distance but I could see the argument for distance

Same with planes. Like yeah if you wanted to make the same trip without a plane it's way more dangerous but what if you just wouldn't take the train from coast to coast in the first place?

IDK, imo time is the most important tradeoff for transport when doing per capita/etc measurements


"Not in a million years"–something that might happen once in a million years for an individual–happens over 8000 times per year for people on earth. Many of them have smartphones with camera to record the event. That's why we have an actual video where quick brown fox jumps over the lazy dog.


My chance of dying because of a dinosaur-extinction-event-class asteroid hit is one in 65 million years, thus every two and a half days on average a random person on earth dies being hit by an asteroid of that size.


This is quite a good point. People often fail to make the distinction between something that happens once every million years to all individual organisms separately versus something that happens once every million years to the planet itself. Our language is usually ambiguous about this.


This implies that your death and all others are not correlated. I’d argue that if an asteroid hits the death correlation is almost 1.


I think that's what he's trying to point out


I wanted to say the same thing but I attempted to use an intuition pump


Similarly, Vatican City has slightly less than 6 popes per square mile.


Had to search for that video, I love things like this. I was not disappointed, so thanks for that. Really tickled me. Doesn't displace my favourite though: https://www.thewestmorlandgazette.co.uk/news/6922546.bull-br...


Well, a wave hit it.

A wave hit it?

A wave hit the ship.

Is that unusual?

Oh yeah, at sea? Chance in a million!

https://www.youtube.com/watch?v=3m5qxZm_JqM


I absolutely love the “Becker Bottle”. It gives you the ability to truly visualize this concept. Super fun to play with and a great learning aid for chemistry classes.

https://www.flinnsci.com/becker-bottle-one-in-a-million/ap45...


Wonderful concept. Thanks for bringing to my attention


It may be nice to know the safety factors used for structural engineering of homes, offices and other regular buildings in the EU.

The Eurocode defines 3 consequence classes: CC1, CC2 and CC3. CC1 has the lowest consequence and is used for regular homes, light industry and agriculture. The chance of dying as a result of structural failure is low, 0.001. The chance for a CC2 building (apartment buildings, offices, hotels etc.) is defined as moderate with 0.03. And CC3 is for special buildings, such as large stadiums, with a high risk of death on structural failure, 0.3. There are other factors that go in defining a consequence class however, including economic and social concerns.

The consequence class maps to the chance that we find it acceptable for a building to collapse in a given year. Causes can be anything, like extreme weather. For CC1 this is 1 in 100, for CC2 1 in 10.000, for CC3 a chance of 1 in 100.000.

So the chance one or more people die in a stadium during a heavy storm due to structural failure could be 1 in 300.000 in a year if you purely look at the statistics behind the structural safety standard.

The statistics map to simple reference values for the loads of wind, snow, rain, usage etc. and easy safety factors. For example CC2 has a safety factor of 1,5 over all variable loads.
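To see how the two sets of numbers combine, here is a small sketch that multiplies the acceptable annual collapse chance by the chance that a collapse is fatal. The figures are copied from this comment and have not been checked against the Eurocode text; the names are made up for illustration.

```python
# Figures as quoted in the comment above, not verified against the standard.
CONSEQUENCE_CLASSES = {
    #      (annual collapse chance, chance that a collapse is fatal)
    "CC1": (1 / 100, 0.001),
    "CC2": (1 / 10_000, 0.03),
    "CC3": (1 / 100_000, 0.3),
}

def annual_fatal_collapse_chance(consequence_class):
    """P(collapse in a year) * P(someone dies | collapse)."""
    p_collapse, p_fatal_given_collapse = CONSEQUENCE_CLASSES[consequence_class]
    return p_collapse * p_fatal_given_collapse

for name in CONSEQUENCE_CLASSES:
    p = annual_fatal_collapse_chance(name)
    print(f"{name}: about 1 in {1 / p:,.0f} per year")
```

For CC3 this lands at roughly 1 in 333,000 per year, which is the "1 in 300.000" figure for the stadium.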


> The chance for a CC2 building (apartment buildings, offices, hotels etc.) is defined as moderate with 0.03

Does this mean 3% or 0.03%?


If a CC2 building collapses, the expectation is that in only about 3% of the cases this leads to someone dying. I don't know the complete reasoning, but can imagine some factors of why this number is far below 100%: a building is not always in use, there are often warning signs with time to escape, and collapse can be localized (not the whole building).


Makes sense, so this is confounded by the number of people in the building.

30% for a CC3 seemed high to me initially (hence wondering if 0.3 really meant 0.3%). But since it actually means "in 30% of structural failures in CC3 buildings (e.g. a stadium), at least one person dies", it makes much more sense because there are probably lots of people in the stadium.


I agree, it would likely have extremely diminishing returns in terms of lives saved to have more stringent safety requirements. Needs to be a balance after all.


It was fun to see this near-adjacent to the submission below. I wonder if one answers the other!

> Q > What has a 1 in a million chance

> A > https://news.ycombinator.com/item?id=38907620


Reminds me of working on a service with a huge amount of traffic

What I found interesting was how often edge cases would occur with that much traffic

Something that would be really difficult to reproduce locally would happen like 100 times a week if you looked at the logs


This is very cool: you have to try to pick a random sequence of 1s and 0s and then it shows you how not random it is.

https://calmcode.io/blog/inverse-turing-test.html


I passed everything up to and including the three digit sequences on my first try. 10 fail, 110 pass. I guess I'm random enough?


Did you ever consider becoming a software library when you grow up?

`from tasuki import random`


I think getting killed by a meteorite is in that order of magnitude.

But it is a bit of a tricky question. Because you can get hit by a small space rock, it has already happened but it is extremely rare, much less than 1 in a million, and I don't know if there is a record of anyone dying from it. But there is also a small chance of a massive impact killing billions in our lifetime, and intermediate scale events like if the Tunguska event happened in a populated area.


There are records of people killed by meteorites, literally in single digits.

A simple statistical test would give you an idea. With 10B people on Earth, a "one in a million per lifetime chance" would happen to 10M people during their lifetimes. If we optimistically assume that a lifetime is 100 years, and the chances do not change with age, the event would affect 100k people every year.


Your calculation is wrong, there are a thousand millions to the billion. If there was a 1/1,000,000 chance across 10 billion people, it would be 10,000 people affected.


You are correct, and I am not!

Then the "one in a million per lifetime" chance would be an event that happens to about a hundred people every year, on average.
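The corrected sanity check as a sketch, keeping the same simplifying assumptions as the comments above (10 billion people, a 100-year lifetime, and a risk that does not change with age):

```python
def events_per_year(population, lifetime_probability, lifetime_years=100):
    """Expected yearly count of an event with the given per-lifetime chance."""
    affected_over_lifetime = population * lifetime_probability
    return affected_over_lifetime / lifetime_years

per_year = events_per_year(10_000_000_000, 1 / 1_000_000)
print(f"about {per_year:.0f} people per year")
```

Ten billion people times one in a million is 10,000 over a lifetime, or about 100 a year.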

Winning a big lottery is within the right ballpark. Flying to space is definitely more rare.


hmm, looks like someone forgot how many millions are in a billion. Those millennials.


I used to tell people I was an order of magnitude more likely to die from a traffic related accident driving to work than from any hostile action during the two years I was in Afghanistan. People seemed challenged by that.


You weren't joking, and actually it's much more than an order of magnitude.

The chances of dying in a traffic accident in US are between 0.9 and 1.2%, whereas the mortality rate of US military servicemen in Afghanistan has been below 0.004%.

That's 3 orders of magnitude.


1.2% is over the lifetime. Don't know what the 0.004% figure means, but in the active war phase coalition forces sustained a 1-2%/year fatality rate in Afghanistan https://academic.oup.com/ije/article/36/4/841/670068?login=f....


Probably because they are different time frames. Your chances of dying in Afghanistan are over say a 4 year period. When someone says you have a 1% chance of dying in a car accident, that's spread over a lifetime. The chance of someone dying in a car accident in that 4 year period is much lower.
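To put the two figures on the same time frame, a lifetime risk can be scaled down to a shorter window. This sketch assumes an 80-year lifetime and the same hazard every year, both of which are simplifications introduced here for illustration.

```python
def risk_over_period(lifetime_risk, period_years, lifetime_years=80):
    """Risk over a shorter window, assuming a constant yearly hazard."""
    survive_lifetime = 1.0 - lifetime_risk
    return 1.0 - survive_lifetime ** (period_years / lifetime_years)

four_year = risk_over_period(0.01, 4)
print(f"1% lifetime risk is roughly {four_year:.3%} over 4 years")
```

Under those assumptions a 1% lifetime chance becomes roughly 0.05% over four years, so the comparison changes a lot once the periods match.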


Reminds me of Jared Diamond explaining why he might not have taken a shower:

https://archive.ph/fNeaH

HN comments: https://news.ycombinator.com/item?id=5145268


This often remains true even over smaller scales. For instance, between 2009 and 2012 I was in a combined arms battalion that deployed to Iraq twice and we experienced three deaths in combat and nine in traffic accidents near Fort Hood during those years. Another drowned in a lake near post. Also, one death due to suicide in theater.


Winning rock-paper-scissors enough times against CGP Grey: https://m.youtube.com/watch?v=PmWQmZXYd74


My favorite mental visualization I came up with: imagine driving from New Jersey to Florida (substitute with a long drive you had). Mine was about 1,200 miles which at 60 miles per hour is 20 hours. That's 72,000 seconds of mind-numbing driving.

Now if during the entire, exhausting, 20 hours of driving, you press a button, and it falls within the "danger zone" that lasts 15 seconds, you lose.

The above example is better when looking at lottery winning chances (worse than 1 in a million) - where you can imagine having to throw a quarter out of the window, hoping for it to land within the proper 1 inch section of the road.

I like this example because it gives you a visceral feeling - allowing you to think in terms of lengths of road or length of time to compare various odds.


> imagine having to throw a quarter out of the window, hoping for it to land within the proper 1 inch section of the road

This idea also helps to illustrate something which is intuitive to me but seemingly hard to explain. Sometimes when an event occurs people will exclaim, "Wow, what are the odds of that?!" and then if it turns out to be a seemingly unlikely event, then it becomes "wow, it's crazy that happened!".

But it's really not. Using the pull-quote as an example, it's the difference between 1) throwing the quarter out the window and drawing a circle around where it lands and 2) drawing a circle on the ground, hoping to land the quarter in it when it's thrown. Of course the exact result in the first case was unlikely but nobody predicted it so it's uninteresting.


It's because they think "that" is "meeting Dave from my previous job on holiday in Turin!" but it's actually "meeting one of the hundreds of acquaintances you've picked up over your life at any point when you're away from home at any time".


15 seconds danger zone / 72,000 seconds of driving = 1 in 4,800

Any reason you chose those numbers? Trying to understand the significance of those odds.


So sorry -- I think I got distracted while writing (doing things around the house) and got the math wrong :'(

I wanted something around 1:1million odds
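For the record, the size of the "danger zone" needed for a given set of odds on that 20-hour drive is a one-liner (the function name is only illustrative):

```python
def danger_window_seconds(trip_hours, one_in):
    """Length of the losing window so that a random button press loses 1 time in one_in."""
    return trip_hours * 3600 / one_in

print(danger_window_seconds(20, 4_800))      # the 15 second window from the example
print(danger_window_seconds(20, 1_000_000))  # window needed for one in a million
```

For one in a million over 72,000 seconds of driving, the window shrinks to 0.072 seconds, so about a blink rather than 15 seconds.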


Also curious! But even that does help to appreciate what a 1:4800 chance something is. I really like that frame of reference.


Reminds me of “one in a million is next Tuesday”:

https://jeffgarretson.blog/2014/07/17/one-in-a-million-is-ne...

Worth a read if you are an SRE.


I like the formulation, “at 100 rps one in a million is 8.6 times each day.”
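That formulation is easy to reproduce for other traffic levels. A small sketch, assuming independent requests and a fixed per-request probability (names are illustrative):

```python
SECONDS_PER_DAY = 24 * 60 * 60

def rare_events_per_day(requests_per_second, probability=1e-6):
    """Expected number of hits per day for a per-request probability."""
    return requests_per_second * SECONDS_PER_DAY * probability

def chance_of_at_least_one(requests_per_second, probability=1e-6):
    """Chance the rare case shows up at least once in a day."""
    n = requests_per_second * SECONDS_PER_DAY
    return 1.0 - (1.0 - probability) ** n

print(rare_events_per_day(100))      # expected hits per day at 100 rps
print(chance_of_at_least_one(100))   # very close to 1
```

At 100 rps that is 8.64 million requests a day, so the expected count is 8.64 and seeing the edge case at least once on any given day is close to certain.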


It's all availability heuristic. If you can think of an example, you think it's more likely.

“Hundreds of children die every year in drowning accidents,” he says. “We need lifeguards at pools more than armed guards at schools.”

https://www.city-journal.org/article/sorrow-and-precaution-n...

https://news.gallup.com/poll/266681/nearly-half-fear-victim-...


“Scientists have calculated that the chances of something so patently absurd actually existing are millions to one. But magicians have calculated that million-to-one chances crop up nine times out of ten.” ― Terry Pratchett, Mort


I wonder if it would be worth it to express unlikely events like this in powers of Dunbar’s number? Round it down to 100 to make the math easier, and also because there’s a lot of overlap in social networks. One million is 100^3, so in your acquaintances-of-acquaintances-of-acquaintances network there’s an expected value of one person who has to deal with this one in a million problem, I think, right?


Only if there is no overlap whatsoever in who is an acquaintance versus who is an acquaintance of an acquaintance. That is extremely unlikely. I'm sure there exists some data out there on how quickly an average social graph grows as a function of degrees of separation that can be used, however.


From a sibling submission on the front page: https://en.wikipedia.org/wiki/Micromort

"Micromort" is a unit of risk defined as a one-in-a-million chance of death, […] used to measure the riskiness of various day-to-day activities.



An interesting read. I was secretly hoping it would have delved more into the psychology of how we perceive that chance, what kind of biases we have. I feel we have a tendency of saying "it never fails" on probabilities which turn out to be much bigger than one in a million.


The last one -- a young man's 1 in 1000 lifetime chance of getting breast cancer -- is actually quite curious: it's not that much lower than his chance of getting testicular cancer, which apparently has an approximately 1 in 250 lifetime chance (I assume in the US). [1]

And apparently a man is about equally likely to die from breast cancer as he is from testicular cancer: both have a lifetime chance of about 1 in 5000.

[1] https://www.cancer.org/cancer/types/testicular-cancer/about/...


Cosmo Kramer : Have you ever met a proctologist? Well, they usually have a very good sense of humor. You meet a proctologist at a party, don't walk away. Plant yourself there, because you will hear the funniest stories you've ever heard. See, no one wants to admit to them that they stuck something up there. Never! It's always an accident. Every proctologist story ends in the same way: "It was a million to one shot, Doc. Million to one!"

https://www.imdb.com/title/tt0697702/characters/nm0000632


Picking a random integer between 1 and 1 million.


That depends on the quality of your randomness. If it's a person picking the number that definitely isn't a 1 in a million chance for most numbers. You need to get pretty specialized hardware before you can confidently say it's definitely an equal chance for every number every time.


No, you don’t need specialized hardware. The output of a modern hash for crypto absolutely needs to produce a uniform distribution across a million.


Technically, it's not actually uniform, or "random" - it only has to be indistinguishable from random, e.g. if you had access to a true random source and a pseudo-random one, it should be computationally infeasible to distinguish them.


Given a finite number of sample values, you can always find a polynomial to match the values.

You can only proclaim that something is "truly random" on epistemological grounds, assuming that true randomness is somehow exposed by the universe. This holds under quantum physics, but not under classical mechanics. Unless there is some quantum effect, RNGs based on fluid dynamics, like lava lamps, may be completely deterministic; we just don't know how to set their initial conditions precisely enough to reproduce a phase trajectory.


You're right that we can never find conclusive empirical evidence for something being truly random, but we can say with confidence that a PRNG is not random because we can look at the algorithm.


Your polynomial has no predictive value in almost all cases.


Yes. Finding out the nature of a PRNG and its initial state would allow one to predict next values, at least theoretically, that is, if it can be computed faster than the values are generated by the process we're looking at.

But the presence or absence of such a function for a naturally occurring process is, again, an epistemological question. That is, whether we can reverse-engineer the universe deeply enough. Unless we find its source code, we're stuck with retrofitting some formulas, and a "random process" is such for which we can't retrofit any better description than a statistical one.


But how do you pick the input for the hash?


You need a seed, and a previous hash. The seed is usually sourced from another random generator, that uses environmental noise and some (salted) hashing as well.

In practice, a precise clock and a few difficult-to-guess events like keyboard and mouse inputs are enough to get a decent seed.
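
For what it's worth, here is a minimal sketch of the "seed plus previous hash" idea. The class name `HashChainRNG` and the domain-separation prefixes are my own assumptions, `os.urandom` stands in for the environmental-noise seed source, and this is a toy for illustration, not a vetted CSPRNG.

```python
import hashlib
import os


class HashChainRNG:
    """Toy generator: each new state is SHA-256 of the previous state (illustrative only)."""

    def __init__(self, seed: bytes):
        # Hash the seed once so the internal state always has a fixed size.
        self.state = hashlib.sha256(b"seed:" + seed).digest()

    def next_int(self, upper: int) -> int:
        """Return an integer in the range 1..upper inclusive."""
        # Advance the chain: the new state depends only on the previous state.
        self.state = hashlib.sha256(b"state:" + self.state).digest()
        # Derive the output with a separate hash so the output does not expose the state.
        out = hashlib.sha256(b"out:" + self.state).digest()
        # A 256-bit value modulo one million has a bias far too small to measure.
        return int.from_bytes(out, "big") % upper + 1


rng = HashChainRNG(os.urandom(32))  # seed taken from the OS entropy pool
print(rng.next_int(1_000_000))
```

Note the state (256 bits) is much larger than the output (about 20 bits). In real code you would presumably just call `secrets.randbelow(1_000_000) + 1` from the standard library instead of rolling your own.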


Current date and time.


Just bang away at the full keyboard for 50-100 characters?


Do you mean, after picking a million numbers, then every number in the range should have been picked exactly once?


Definitely not this. That wouldn't be random at all. For example, after picking 999,999 numbers you would be able to predict the next number with certainty. What it means is that every time you select a number, every number in the range has an equal probability of being selected.


Which means that there must be more state in the generator than is output as a random number; otherwise each time number X was produced then number Y would follow it. (Perhaps obvious, but I like stating the obvious.)


No, each number has a 1 in 1,000,000 chance each time the generator is used.


That's not a uniform distribution. Probability doesn't remember previous state, so the chance of picking the number 241 out of 1 million numbers remains exactly the same after picking 241 once. In particular, the chance of getting no duplicate numbers if you pick a million times, is very close to zero.
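
To put a rough number on "very close to zero": the chance that n uniform draws from n values are all distinct is n!/n^n. A sketch using logarithms, since the value itself underflows a float (the function name is just illustrative):

```python
import math


def log10_prob_no_duplicates(n: int) -> float:
    """log10 of n!/n**n, the chance that n uniform draws from n values are all distinct."""
    # lgamma(n + 1) is ln(n!); working in logs avoids underflow for large n.
    log_p = math.lgamma(n + 1) - n * math.log(n)
    return log_p / math.log(10)


print(log10_prob_no_duplicates(1_000_000))
```

For a million draws the exponent comes out on the order of minus 434,000, so the probability has hundreds of thousands of zeros after the decimal point.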


It still is a 1 in a million chance if you have no a priori knowledge about the way the random numbers are distributed.


It would be interesting what kind of clustering (if any) there would be if 1 million people were asked to pick a number between 1 and a million.



I think true randomness was implied in GP’s comment. A person picking a number “randomly” from a million is not random in the mathematical sense.


There's no evidence of true randomness though. There's only evidence of us having missing information to do the prediction of what any RNG would offer us next.


>If it's a person picking the number that definitely isn't a 1 in a million chance for most numbers.

Only if the person isn't aware of the issues involved.


This is the right answer. I estimate no more than 1 human in a million can choose such a number at random.

Therefore if a human picks a number between 1 and 1 million, there's only a 1 in a million chance that the number was picked randomly.


Nope, picking a random integer between zero and one million and one would be though.


I am seeing this same/similar comment on this thread which has been posted by three quite new accounts. Is this quite normal on HN now or is it some bots farming engagement? Btw, I'm a new member here, knew about HN since a long time but never bothered to make an account up until now


>Is this quite normal on HN now or is it some bots farming engagement?

As you can see, my account is not new. I thought the comment will be funny. Not every comment should be terribly ingenious, long thought and crafted to show people how smart the commenter is. As a commenter you have the right to be spontaneous.


> As you can see, my account is not new. I thought the comment will be funny.

Yeah I did find it funny, no doubt. Was just asking of general culture here


Looks like one old-ish account with a number of submissions and comments, and two very-similarly-named copycats that are getting voted down to oblivion. Not very common here, but it happens.


One of them seems to have deleted the comment just as I commented about this. Also downvoting is an option? It doesn't seem to be visible here.


Downvoting is only available to users above a certain karma threshold, I believe around 500 or so.


Found a fun CGP Grey video on a 1 in a million experience: https://youtu.be/PmWQmZXYd74?si=BDBkrfxEXgAgeRfw


Is there a word for what he's doing with his voice in this video? Vsauce does the same thing. It's a strange cadence that would make me lift an eyebrow if someone was speaking to me like that in real life. It's like if captain Kirk removed emotion from his voice and began to lecture you.


The Linguistics of 'YouTube Voice'

https://news.ycombinator.com/item?id=10693664

From 2015, so the style might have changed

https://news.ycombinator.com/item?id=10695754


Wendover Productions is another major offender.


From the article: “24 million babies will someday be President.” This seems an improbable claim. Is he saying something sensible by virtue of the surrounding analysis that I’m just not following?


The parsing is ambiguous, read it with parentheses as follows: 1 in (6 times 4.0 million = 24 million) babies will someday be President


Yeah, I think it's a typo and should have been "one in every 24 million babies" - this makes sense given the numbers, with 4 million born every year but only a new president every 6 years.


This was a nice read. Takes me back to Probability 101, which was a fairly eye-opening experience, not only because you get to learn the basis of a lot of research (sampling, deviation, etc.) but also because of the many counterintuitive behaviors of actual randomness. It was a little bit like learning the physics of math, if that makes sense.


i immediately thought of the 2^20 examples. I often marvel at the 2^10 is 1000ish coincidence

but people say things like "1 in a million" because it sounds incalculably rare, so I'm not sure I understand the goal here? to make an exaggeration sit on an intuitive scale as if it was meant literally? if successful that will make being "literally one in a million" fall out of favor just as literally has.

one thing I took away from modestly extensive study of linguistics was, people need to stop thinking that words have strict definitions, and realize that words adopt definitions to suit what people are trying to say. What did they mean is more important than what did they say.


Lloyd Christmas and Mary Swanson.


One of my favorite GPT-4 mistakes mentioned

Struck by a Meteorite: The odds of a person being struck by a meteorite are estimated to be about one in 1.6 million.

I wonder how plausible things like this seem to people…


since we're having a bit of fun:

Mary: "I'd say.. more like one out of a million." Lloyd: (slowly reacts) "So you're telling me there's a chance? … yeah!!"

Dumb and Dumber (1994)


> As I tell students, your grandmother is too sensible to be outdoors during a thunderstorm and a disproportional number of deaths are young men.

reminds me of https://xkcd.com/795/


Your grandmother very probably doesn't work in a rail gang.

    ELEVEN members of a railway maintenance crew had to be taken to hospital on Saturday after lightning struck train tracks they were working on in WA’s Goldfields-Esperance region.
https://www.news.com.au/technology/environment/railway-maint...


On government insurance you have 1 in a million+ chance to be assigned an unbiased, compassionate to pain, not anti-engineer psychiatrist.


Anything with a 50/50 accomplished 20 times in a row is right around 1,000,000 : 1.
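
Checking that with exact fractions, as a small sketch (the helper name is made up):

```python
from fractions import Fraction


def streak_probability(p: Fraction, n: int) -> Fraction:
    """Probability that an event with probability p happens n times in a row."""
    return p ** n


print(streak_probability(Fraction(1, 2), 20))  # 1/1048576
```

So twenty 50/50s in a row is 1 in 1,048,576, a shade rarer than one in a million.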


Existing is such an unlikely chance that it's more likely we don't.


A lottery with 1 million tickets?

Winning ~3-4 times at the roulette when betting on numbers?


Rolling 100 dice and landing the same number


More like 8 or 9 dice landing on the same number.
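
A rough sketch of the numbers, assuming fair six-sided dice and "the same number" meaning any face as long as they all match (the function name is hypothetical):

```python
def one_in_all_same(n_dice: int, sides: int = 6) -> int:
    """Return N such that n_dice fair dice all showing the same face is a 1-in-N event."""
    # The first die can be anything; each of the other n_dice - 1 must match it.
    return sides ** (n_dice - 1)


for n in (8, 9):
    print(n, one_in_all_same(n))
```

That gives 1 in 279,936 for 8 dice and 1 in 1,679,616 for 9. If the number has to be called in advance it is `sides ** n_dice` instead, so 8 dice already give 1 in 1,679,616.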


Yeah, no. Read the article. There’s an example with a coin flip, which only has two possible values, and therefore a much, much higher probability than a die.


Martians coming to Earth and, specifically, to the Woking area near London.


Being Dealt a Royal Flush in Poker. NO but pretty close.

Being Born on February 29th. NO


Royal flush in poker is more like 1 in 6493. ;-)


How did you get there? GP is talking about being dealt it, I think you've calculated the chance of it occurring in a round or something?

Wikipedia has GP's at 649739, so yeah 'almost' a million, roughly speaking. 4 / ((52 nCr 5) - (4 nCr 1)). (Four suits, one way to do it in each suit, deck of 52, five card hand.)


In games where hands are made from 5 out of 7 cards (e.g. texas hold'em), the odds are 30,939:1

I can't think of any common form of poker where it would be ~6000:1
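
For anyone who wants to check both figures, a sketch with `math.comb`, assuming a standard 52-card deck (the helper name is mine):

```python
from fractions import Fraction
from math import comb


def odds_against(favorable: int, total: int) -> Fraction:
    """Odds against an outcome, as (unfavorable hands) / (favorable hands)."""
    return Fraction(total - favorable, favorable)


# Five-card deal: exactly one royal flush per suit.
five_card = odds_against(4, comb(52, 5))

# Best five of seven cards: the 5 royal cards in one of 4 suits, plus any 2 of the other 47.
seven_card = odds_against(4 * comb(47, 2), comb(52, 7))

print(five_card, seven_card)  # 649739 30939
```

Neither comes anywhere near 6,000:1.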


This is a solved problem, otherwise casinos would go bankrupt.


> If we guess a President will serve on average about 6 years, then 1 in 6 times 4.0 million = 24 million babies will someday be President.

This sentence desperately needs parentheses. Took me five minutes to parse correctly.


To save some time for the rest of us:

> If we guess a President will serve on average about 6 years, then [1 in (6 times 4.0 million =) 24 million] babies will someday be President.


I prefer the interpretation of an ominous prediction that the presidency will be filled by 24 million babies someday soon.

Switzerland does have a committee of 7 people as president, only a small step from that to 24 million babies.


Of course it's a very small step to make babies amounting to 3 times the current population of Switzerland (so assuming roughly a 50/50 female/male divide, everyone needs to make 6 babies on average, in the same year so they will still all be babies when the last one is born? that's probably impossible for the females but we are already way into the science-fiction category) and put them all in a committee.

The more I think about this, the funnier and more complicated this gets.


What do you mean, "that's *probably* impossible" to make 6 babies from the same mother in a single year? Takes about 9 months to output a human, you know?


Ha! Nine women can have that baby in one month; management says so.


It’s not impossible to have sextuplets.

It’s probably impossible for a population to average that rate though.


> Takes about 9 months to output a human

Takes about 9 months to output N humans, where N approaches 1 on large averages, as even twin births are like 1.2 percent.

Also many females cannot be pregnant again for a while after giving birth, even if we had sped up the pregnancy.

To average 6 babies, we need something like giving birth to triplets every 5 months on average, which is improbable even if we had the technology, will and the economics that could support this.

But you know, "improbable" is such a funny word, especially out of context :)


> Takes about 9 months to output a human, you know?

It can also be done months faster getting multiple humans though.


Twins and larger multiples are a thing...


I was once informed that everybody has eaten at least eight spiders in their sleep. Another person in the same car confidently confirmed:

> It's true, if you're on your deathbed and you've only eaten one spider seven more come along and jump right in there at that last second.

I figured it was that sort of reasoning that got us to 24 million babies.


This particular "fact" actually has a known source. Someone made it up as an example of the kinds of things people will believe: https://www.snopes.com/fact-check/swallow-spiders/


The intelligence of current generation of "AI" has been compared to a baby's, so... ;-)


I just assumed they had a mini stroke, I didn't realise there was a way for that to make sense


[flagged]


Why "without replacement"? The question doesn't seem to refer to multiple events. For just one event, it doesn't matter if you do with or without replacement, because there is nothing that can be replaced.


Hash collisions ...


Picking an integer between 1 and 1000000 without replacement.



