Imagine if (in the future) some device like a phone could detect this information around you and automatically record it. Building games on top of this life data would be weird, neat, fun and sad all at the same time. Imagine seeing a real example of where someone else is just luckier than you are in stupid but impactful (on your morale) ways.
If it didn't seem so tedious to track, I'd love to implement an app to record this info. Unfortunately no one I know would care, and I'm sure I'd get too lazy to keep it accurate. Neat nonetheless, thanks for the cool thoughts :)
Couples can have any number of sons, and every couple has exactly one daughter. Still, the accepted mathematical solution is an equal gender ratio for the couples' children.
In contrast, "0 sons" is going to describe a full half of all marriages.
So number of expected daughters = 1, number of expected sons = 1. In practice, since women can't have an infinite number of children, this wouldn't be an infinite series, so the real number of expected boys would be lower than one, but there you go…
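A quick simulation can make that 1:1 expectation concrete. This is only a sketch, assuming every birth is an independent fair coin flip and each couple stops at its first daughter; `simulate_families` and `max_children` are names made up for this illustration.

```python
import random

def simulate_families(n_families, max_children=None, seed=0):
    """Return (mean sons, mean daughters) per family.

    Assumption: each birth is an independent 50/50 draw, and a couple
    stops at the first daughter (or at max_children, if a cap is given).
    """
    rng = random.Random(seed)
    sons = daughters = 0
    for _ in range(n_families):
        children = 0
        while max_children is None or children < max_children:
            children += 1
            if rng.random() < 0.5:
                daughters += 1
                break  # first daughter: this couple stops
            sons += 1
    return sons / n_families, daughters / n_families

if __name__ == "__main__":
    print(simulate_families(200_000))                  # both means close to 1
    print(simulate_families(200_000, max_children=3))  # sons fall below 1
```

With a cap of three children the expected number of sons works out to 1/4 + 2/8 + 3/8 = 0.875, which is the "lower than one" effect of truncating the series. Note that under a cap not every couple ends up with a daughter either.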
Now, for the bus case, you get +1 if your bus turns up first, and -1 for every other bus that turns up first. Assume that it is completely random, then:
expected + score is: 1/2 * 1
expected - score is: 1/4 * -1 + 1/8 * -2 + …
Expected + is 0.5, expected - is -1.
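Those two numbers can be checked by summing the series numerically. A minimal sketch, assuming each arriving bus is independently yours with probability 1/2, so that exactly k other buses turn up first with probability (1/2)^(k+1); the function name `expected_scores` is invented here.

```python
def expected_scores(terms=60):
    """Partial sums for the bus-scoring game.

    Assumption: each arriving bus is yours with probability 1/2,
    independently, so exactly k other buses come first with
    probability (1/2) ** (k + 1).
    """
    plus = 0.5 * 1  # +1 only when your bus is the very first to arrive
    minus = sum(-k * 0.5 ** (k + 1) for k in range(1, terms))
    return plus, minus

print(expected_scores())  # approximately (0.5, -1.0)
```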
> 1 when your bus turns up, -1 for every bus going the other way.
> E[X]/E[Y] = E[X/Y] is not a valid identity.
is completely irrelevant here because it is being used to point out that a non-answer is wrong.
BTW, on a slightly unrelated point, if there's no timetable, but the interval between buses is maintained reliably, the waiting time is uniformly distributed over that interval.
If you have to get a second bus, you need to convolve two of those uniform distributions to find out the distribution of overall journey times. This is a trapezoidal distribution (triangular if the two intervals are equal), which is just about analytically manageable.
But a journey with two transfers (3 buses in total) results in a likely overall time distributed according to a uniform distribution convolved with a trapezoidal distribution, which is a very weird non-smooth shape. You can see why people choose to model distributions with Gaussians, which are well-behaved (convolve two Gaussians, get another Gaussian). The Gaussian just lends itself ideally to recursive applications, hence recursive filtering (e.g. Kalman filters).
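The convolutions are easy to do numerically even where the closed form gets awkward. The following is a rough sketch that discretizes waits to whole minutes; the 10 and 20 minute headways and the helper names `uniform_pmf` and `convolve` are assumptions picked for illustration.

```python
def uniform_pmf(minutes):
    """Wait in whole minutes, uniform on 0 .. minutes - 1."""
    return [1.0 / minutes] * minutes

def convolve(p, q):
    """Distribution of the sum of two independent discrete waits."""
    out = [0.0] * (len(p) + len(q) - 1)
    for i, pi in enumerate(p):
        for j, qj in enumerate(q):
            out[i + j] += pi * qj
    return out

first = uniform_pmf(10)   # assumed 10-minute headway
second = uniform_pmf(20)  # assumed 20-minute headway

two_buses = convolve(first, second)       # trapezoid: ramp up, flat top, ramp down
same_headway = convolve(first, first)     # equal intervals give a triangle
three_buses = convolve(two_buses, first)  # piecewise quadratic, no longer a simple shape
```

Printing `two_buses` shows the flat top between minutes 9 and 19, while `same_headway` peaks at a single point.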
I suspect this analysis can be carried out and yield quite good results in the gaussian case (a careful analysis might even yield error bounds on the result).
It's a bit less clear that gaussians should be used when e.g. fitting a coordinate to an astronomical feature, which might not actually be symmetrical.
The other useful property that the gaussian has is its separability, in the 2D case. That is unique to the gaussian and counts for a lot.
It's also possible for the result to be biased because of scheduling. If inbound buses pass every 10 minutes at 16.00, 16.10, 16.20,... and outbound buses at 16.01, 16.11, 16.21, ... you'll usually see an inbound bus first. Though I expect this was not the case here.
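For the timetable in that example the size of the bias can be worked out directly. A small sketch, assuming the rider arrives at a uniformly random second within the 10-minute cycle; `first_bus` and its parameters are illustrative names, not anything from the article.

```python
def first_bus(arrival_s, headway_s=600, inbound_offset_s=0, outbound_offset_s=60):
    """Which direction shows up first for a rider arriving at arrival_s?

    Offsets are seconds past the start of each 10-minute cycle:
    inbound at :00, outbound at :01, as in the example timetable.
    """
    wait_in = (inbound_offset_s - arrival_s) % headway_s
    wait_out = (outbound_offset_s - arrival_s) % headway_s
    return "inbound" if wait_in <= wait_out else "outbound"

# Try every whole-second arrival time within one cycle.
inbound_share = sum(first_bus(t) == "inbound" for t in range(600)) / 600
print(inbound_share)  # 0.9
```

Only riders who arrive in the one minute between the inbound and the outbound departure see the outbound bus first.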
In other words, when you're waiting for a bitcoin transaction to be confirmed and go to check how long ago the most recent block was produced, in order to estimate how soon the next one will come - you're doing it wrong. Even if the previous block was found 9 minutes ago, your average waiting time for the next block is still 10 minutes.
1. If you pick a block randomly (uniformly), its average length is 10 minutes.
2. If you pick a point t0 in time randomly (uniformly), the average length of the block you're in is 20 mins (and the average length from t0 to next block is 10 mins, and the average length from previous block to t0 is also 10 mins (and, needless to say, 10+10=20...)).
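Both statements can be checked with a simulation. This is a sketch under the assumption that block intervals are independent exponentials with a 10-minute mean (constant hashpower and difficulty); `inspection_paradox` is a name invented for the example.

```python
import bisect
import random

def inspection_paradox(n_blocks=200_000, mean_minutes=10.0,
                       n_samples=100_000, seed=1):
    """Return (mean interval per block,
               mean interval containing a random time t0,
               mean wait from t0 to the next block)."""
    rng = random.Random(seed)
    gaps = [rng.expovariate(1.0 / mean_minutes) for _ in range(n_blocks)]

    # Cumulative block timestamps; interval i ends at times[i].
    times, t = [], 0.0
    for g in gaps:
        t += g
        times.append(t)

    containing = wait = 0.0
    for _ in range(n_samples):
        t0 = rng.uniform(0.0, times[-1])
        i = bisect.bisect_left(times, t0)  # interval that contains t0
        containing += gaps[i]
        wait += times[i] - t0
    return sum(gaps) / n_blocks, containing / n_samples, wait / n_samples

print(inspection_paradox())  # roughly (10, 20, 10)
```

Picking t0 uniformly in time lands in long intervals more often than in short ones, which is where the factor of two comes from.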
Suppose this is your first flip, one would intuitively think that there is 50% chance of H and 50% T.
Suppose you flipped once and got H. For your second flip, one might intuitively think that since the number of times getting H over the long run is 50% of the total flips, and we already have a flip of H, to "balance it out" the next flip should have a smaller probability of getting H. Instead that is wrong. The next flip still has 50% chance of H.
Suppose further that one has performed N flips, all of them H. One might even think that because of the way the geometric distribution works, it is very unlikely for the next flip to be H again. Instead that is wrong. The next flip still has 50% chance of H.
whether the actual time is 10 minutes or 100 years, knowing that somebody else solved one recently doesn't speed up your time to find one
I'll give an extreme example to make this clearer. Suppose 10X hashpower just came online an hour ago. It's quite likely that ~60 blocks have been found in the last hour, assuming the difficulty adjustment hasn't happened since. Seeing this, one could deduce that hashpower went up by ~10x and that the expected time till next block is roughly 1 minute instead of 10.
Now, in most cases hashpower doesn't change that drastically but it remains true that recent block times give you more than 0 information about hashpower and therefore about the expectation for future block times.
I think it made me completely careless about time, I would just go between stops and take the first one, go with the flow. By experience I'd know the range it would take for me to reach big places around the area.
I had a friend who was completely foreign to this mode of thinking, she was very diligent and fully trusting (although she mostly used trains so a lot less divergence).
It reminds me of kid studies about intelligence / wealth ratios. When your environment is random, you think random. When it's predictable, you plan.
I've never really understood any example involving a Poisson process. They always seem to involve bus arrivals or light bulbs burning out, and I can't understand why the memoryless property would ever make any sense for these.
Even if the bus system was poorly run, why would it make sense to assume that the expected value of time to arrival doesn't change based on how long you've been waiting?
What is an actual phenomenon that is well modeled by a Poisson process?
This real world example still doesn't perfectly match the theory. For example, if there was no call for a long time, it may indicate that it's some special day or the phone line is malfunctioning or whatever and it could mean that the next call is probably further in the future than the model would say.
Time to next Bitcoin block mined. It's 10 minutes, regardless of whether you've waited 1 hour, 10 minutes, or 10 seconds.
Makes sense though, because all the failed hashes are useless, thus no memory.
Why Is It Taking 20 Minutes to Mine This Bitcoin Block?
Geiger counter clicks.
Arrivals between, say, 2-3 PM at a busy Web site.
Equipment failures for equipment with a constant hazard curve -- if you can find such equipment.
Time between road kills on a highway.
Radioactive decay. Collisions of fluid molecules. Unstimulated (i.e. not in a laser) photon emission due to electron transitions in an atom. Lots of pretty memoryless stuff going on at the microscopic level.
I don't think it's saying anything about how long you've been waiting, and you don't know when was the last arrival.
It's saying that if you pick a random point on the timeline, the expected wait time doesn't change. That's because by taking a random point you have more chances of landing in a larger stretch of wait time than in a smaller one.
This is exactly what memorylessness says something about.
Your second paragraph isn't unique to Poisson processes, but the author right at the start says that the expected value of the waiting time is the same as the average interarrival time, which indicates Poisson.
I used it in a cell tissue simulation where the user could define how frequently the cells divide. If you start with 100 cells and want them each to divide, on average, every N iterations, using a Poisson formula to decide if a cell splits or not based on a random number is ideal, very precise (in the aggregate), and avoids a lot of odd artifacts.
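One way such a division rule might look is sketched below. This is a guess at the general idea, not the commenter's actual code: it assumes each cell divides as a Poisson process with rate 1/N per iteration, so the chance of at least one division event within a single iteration is 1 - exp(-1/N). The names `split_probability` and `step` are made up.

```python
import math
import random

def split_probability(mean_iterations):
    """Chance that a cell divides during one iteration.

    Assumption: divisions follow a Poisson process with rate
    1 / mean_iterations per iteration. For large N this is close to 1/N.
    """
    return 1.0 - math.exp(-1.0 / mean_iterations)

def step(n_cells, mean_iterations, rng):
    """Advance the population by one iteration and return the new count."""
    p = split_probability(mean_iterations)
    splits = sum(rng.random() < p for _ in range(n_cells))
    return n_cells + splits

rng = random.Random(4)
population = 100
for _ in range(20):
    population = step(population, mean_iterations=10, rng=rng)
print(population)
```

Because every cell draws independently on every iteration, divisions are spread out in time rather than happening in synchronized waves, which is presumably the kind of artifact being avoided.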
The simulations were worth the article on their own. The real world analysis was a great bonus.
Anecdotally, I was expecting confirmation bias to be the main culprit. Pleasantly surprised to see I was wrong.
I don't know how common it is but it does exist. And buses perpetually being early means that if you're on time you wait even longer for the next one.
One reason buses are late is because a bus must travel a circuit. Cars provide linear transportation, so the delay can only happen in the direction of your travel. Since buses run a circuit, they are impacted by delays in the direction opposite of your travel as well.
Your bus might be late because the return route has traffic or other delays. Or maybe a drunk or drug user got in a fight with the driver and the police were needed. Or someone in a wheelchair had a problem getting onto the lift.
Where do you live? I'm guessing the bay area?
I was carless in DC for a year and Google Maps was ALWAYS wrong about when buses would arrive. My friend recommended an app called Transit which was right about 90% of the time, which was a godsend for me.
But it shows stop locations and plans routes quite effectively. Using the 2 apps in tandem creates a workable solution.
Although Google Maps' transit planner is invaluable for finding possible combinations of buses to use, I rely on OneBusAway to tell me which one is actually going to be faster right now.
> just return the empirical expected time it takes for the next bus
There is a world of complexity in "the empirical expected time", there... expected according to what models?
Anecdotally, I think it's especially hard to model because any given delay is probably attributable to one or a few specific incidents. This isn't a situation where everything averages out and we can use a nice tractable AWGN model; we're down in the muck and the shot-noise.
The issue here is the deviation between empirical expected time and actual arrival time. Unhandled exceptions abound.
If you take the average farm, chances are that it's doing humane farming. But if you take the average animal, it has an overwhelming chance of being in an industrial farm.
I'm sure drivers try to actively manage this, but if they didn't I suspect the system would naturally evolve toward pairs of buses leapfrogging each other on long routes.
For example, this is an article about a Japanese train company issuing a public apology for departing 20 seconds early. https://www.bbc.com/news/world-asia-42009839
(From reddit - https://www.reddit.com/r/programming/comments/9s4j58/the_wai... )
So yes, if you want an accurate count of siblings, you would consult some spreadsheet that just lists how many children each family had. If you go and start asking the families themselves (those "in the mix") then your results will be skewed.
I thought the article that this article linked to was also very good, "The Inspection Paradox Is Everywhere," by Allen Downey, http://allendowney.blogspot.com/2015/08/the-inspection-parad...
Nice article, btw, interesting topic!
Not whether other people with different commutes would experience the same, or if it's true throughout the day.
Plus, it would incur a sampling bias based on the author's particular commute schedule.
After all, the point of the bus arrivals isn't in service of the bus (or driver) but of the passengers. Observed average wait time at each bus stop is a better measure. An even better measure would be average wait time weighted by number of passengers, which is tougher to measure empirically, or even model, than just average wait time for that one person, since it requires counting passengers boarding, not just bus arrival times.
Better to consider each bus stop as an asset to invest in, the more valuable it is, the more people you can serve.
Perhaps you misunderstood my point, which was more about data and statistics, as is the article itself, rather than transportation.
A similar argument could apply to the article's example of "average class size", where that's a valid statistic when observed by a teacher (or facilities manager), but misleading to a potential student. Something like "average size of a freshman's classes" would be more meaningful to a prospective student, and "oversampling" would not be a valid complaint there, either.
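The gap between the two class-size statistics is easy to see with a toy calculation. The enrollment numbers below are made up purely for illustration, and the function names are invented.

```python
def average_per_class(sizes):
    """What a teacher or facilities manager sees: mean over classes."""
    return sum(sizes) / len(sizes)

def average_per_student(sizes):
    """What a student sees: mean size of the class a randomly chosen
    enrolled student sits in (each class weighted by its own size)."""
    return sum(s * s for s in sizes) / sum(sizes)

sizes = [10, 10, 10, 100]  # made-up enrollment figures

print(average_per_class(sizes))    # 32.5
print(average_per_student(sizes))  # about 79.2
```

Most students are in the one big class, so the student-weighted average is pulled toward 100 even though three of the four classes are small.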
> instead of waiting at the bus stop, because of what happened to them yesterday at the bus stop.
It sounds like you're suggesting that there's an even better measure than the two I proposed, rather than the original measure being better. If so, I don't dispute that there could be many more, as I never claimed "best".
In this instance, though, measuring people who never show up to the bus stop in the first place is impossible, and even measuring those who showed up but abandoned waiting (i.e. never boarded) is impossible without additional instruments (whereas, presumably, electronic fare collection equipment could closely enough approximate counting boardings).
Go from arrival to cumulative arrivals to time of arrival to recurrence of arrival (next arrival). All are Poisson processes, including the recurrence process, which has a fixed expected value.
It's just the Poisson process, e.g., with a nice chapter in E. Cinlar, Introduction to Stochastic Processes.
Buses come as arrivals, so bus arrivals are a stochastic arrival process. Here "stochastic" just means varying randomly over time, and the "randomly" doesn't really restrict anything: it includes deterministic arrivals, that is, arrivals known exactly in advance, but also admits any case of unpredictability.
Well, in short, if you have a stochastic arrival process with stationary, independent increments, then the arrival process is a Poisson process and there is a number, usually denoted by lambda, so that the times between arrivals are independent, identically distributed random variables with exponential distribution with arrival parameter, the arrival rate, lambda. The stationary means that the probability distribution of the times between arrival does not change over time. The independent increments means that the time from one arrival to the next is independent of all the past history of arrivals.
The exponential distribution has the property, easy to verify with simple calculus, that the conditional expectation of the remaining time until the arrival, given that the arrival time is already greater than some number, is the same as the expected arrival time.
So, net, if bus arrivals form a Poisson process, then the time until the next bus arrives is the same after waiting five minutes as not having waited at all.
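The same property can be checked by simulation instead of calculus. A minimal sketch, assuming exponential times with a 10-minute mean; `mean_remaining_wait` is a name invented for this example.

```python
import random

def mean_remaining_wait(already_waited, mean_gap=10.0, n=400_000, seed=2):
    """Average extra wait among draws that exceed already_waited.

    Assumption: the time to the next bus is exponential with mean mean_gap.
    """
    rng = random.Random(seed)
    remaining = [
        t - already_waited
        for t in (rng.expovariate(1.0 / mean_gap) for _ in range(n))
        if t > already_waited
    ]
    return sum(remaining) / len(remaining)

print(mean_remaining_wait(0.0))  # about 10
print(mean_remaining_wait(5.0))  # still about 10
```

Repeating this with times drawn uniformly from a fixed interval would show the remaining wait shrinking as you wait, which is the behavior most people expect from real buses.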
Cinlar's treatment is nice because it is qualitative, that is, has assumptions that can often be confirmed or believed just intuitively. And we might not believe that bus arrivals meet the assumptions.
This subject can continue with, say, hazard curves for equipment failures and a lot more about Poisson processes.
E.g., the sum of two independent Poisson processes, say, Red buses and Blue buses, assuming that they are Poisson processes, is also a Poisson process with arrival rate the sum of the Red and Blue arrival rates. If you randomly throw away some arrivals, then what is left is also a Poisson process with arrival rate adjusted in the obvious way.
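Here is a small simulation of merging and thinning. It is only a sketch: the rates of 0.10 and 0.05 buses per minute and the keep probability of 0.5 are assumptions chosen for illustration, and it checks the resulting rates rather than proving the merged or thinned stream is Poisson.

```python
import random

def poisson_arrivals(rate, horizon, rng):
    """Arrival times of a Poisson process with the given rate on [0, horizon]."""
    times, t = [], rng.expovariate(rate)
    while t <= horizon:
        times.append(t)
        t += rng.expovariate(rate)  # exponential gap to the next arrival
    return times

rng = random.Random(3)
horizon = 100_000.0  # minutes simulated

red = poisson_arrivals(0.10, horizon, rng)   # assumed Red bus rate
blue = poisson_arrivals(0.05, horizon, rng)  # assumed Blue bus rate

merged = sorted(red + blue)                        # sum of the two processes
kept = [t for t in merged if rng.random() < 0.5]   # throw away each arrival with prob. 0.5

merged_rate = len(merged) / horizon  # close to 0.10 + 0.05 = 0.15
kept_rate = len(kept) / horizon      # close to 0.5 * 0.15 = 0.075
print(merged_rate, kept_rate)
```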
In Feller's volume II is the renewal theorem that the sum of independent arrival processes, Poisson or not, with mild assumptions, converges to a Poisson process as the number of processes summed grows. So, if the users of a sufficiently busy Web site act independently with mild assumptions, then the Web site will see arrivals accurately as a Poisson process.
The vanilla Poisson process is Geiger counter clicks.
There is much more to the pure and applied math and applications of Poisson processes.
This is very ambiguous. Unless he gives a time frame the numbers do not make sense. Average in a week? Average in a year? This is not how it works in real life.
And I cannot accept his premise. My experience tells me that, in New York, when I used to take a bus to work, sometimes the bus was coming as I was walking to the stop; sometimes I would wait a long time. Sometimes not very long. There was no observable bias.
If you are talking about spherical-cow style Poisson buses, yeah (that's what the author means by "reasonable assumptions"). But as the author concludes, bus arrival times are not well modeled by a Poisson process.