Simple mathematical law predicts movement in cities around the world (scientificamerican.com)
120 points by iamwil 11 days ago | 33 comments

> researchers discovered what is known as an inverse square relation between the number of people in a given urban location and the distance they traveled to get there

They found that the frequency of visiting a place, multiplied by the distance traveled (rf), forms a stable parameter which can be used as a single dependent variable. There is then an (approximate, statistically fitted) inverse square law involving this combined variable. Or two square laws.

If we hold frequency constant (say "once a month" or whatever), then the number of people visiting some place drops off inverse square with distance. If 400 people are willing to visit some place once a month that is 10 km away, about 100 once-a-month visitors will come from 20 km away.

Or if we hold distance constant: if 400 people are visiting some place that is 10 km away once a month, about 100 will be visiting twice a month.
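A minimal sketch of the arithmetic in the two examples above, assuming the visitor count follows N = k / (r·f)^2. The function name and the constant `K` (calibrated from the 400-visitor example) are illustrative, not taken from the paper.

```python
def visitors(distance_km, visits_per_month, k):
    """Visitor count under an assumed inverse-square law in r*f."""
    return k / (distance_km * visits_per_month) ** 2

# Calibrate k from the example: 400 people at 10 km, once a month.
K = 400 * (10 * 1) ** 2

print(visitors(10, 1, K))  # 400.0
print(visitors(20, 1, K))  # 100.0 (double the distance, same frequency)
print(visitors(10, 2, K))  # 100.0 (same distance, double the frequency)
```

Any two (distance, frequency) pairs with the same product give the same count under this assumption, which is the property the article's 2 km / 5 km example relies on.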

> It accurately predicts, for instance, that the number of people coming from two kilometers away five times per week will be the same as the number coming from five kilometers twice a week.

It doesn't predict this; rather, this frequency-distance product being a stable parameter is a discovery from the data, on which the formula is then based. I.e., this frequency-distance product becomes a model assumption baked into the formula, not a prediction.

I interpret the inverse square law as being the direct result of the area of a circle scaling as the square of the radius. Imagine dividing a given circle overlaid on a city into fixed sized area units. If you randomly visit a unit, the probability of visiting a particular unit scales as 1/area.

Note that this derivation is essentially identical to showing that electric field falls off as 1/(r^2). In that case the area refers to the area of a sphere through which field lines must pass.
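One way to write out the area argument, as a sketch under the assumption that every unit of area $a$ inside a circle of radius $r$ is equally likely to be visited. The symbols $a$, $N$ and $p$ are introduced here for illustration only.

```latex
N(r) = \frac{\pi r^{2}}{a}, \qquad
p(r) = \frac{1}{N(r)} = \frac{a}{\pi r^{2}} \propto \frac{1}{r^{2}}
```

For the sphere in the electric-field analogy the area is $4\pi r^{2}$, which gives the same $1/r^{2}$ dependence.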

Some of the specific examples in the article seemed off to me too. However, I did love the book Scale by Geoffrey West who is one of the authors of the paper discussed in the article.

At the very least, the book is a great introduction to the many different aspects that scale up with city size.

Good description — I also thought that the example was off.

I'd be very interested to see how this law holds for travel destinations, both from the POV of the destination (national park, ski area, attraction), and from the POV of occasional travelers.

> researchers discovered what is known as an inverse square relation between the number of people in a given urban location and the distance they traveled to get there

I'm highly skeptical of this result as my lived experience is that travel time is so much more important than distance.

There are two parts of my city that I like to visit that are roughly equidistant from my home. One takes 20 minutes to get to, the other 45. Can you guess which one I visit more often?

This is known as a "Gravity Model" in the transportation literature and it's very far from being a new discovery. It's been around since the 1930s under various guises (https://en.wikipedia.org/wiki/Gravity_model_of_trade)

We used the same concept in our 2009 paper (https://www.pnas.org/content/106/51/21484) but the exact functional form of the distance dependency (1/r, 1/r^2, 1/e^r, etc) varies with their exact definition of city due to the Modifiable Areal Unit problem (https://en.wikipedia.org/wiki/Modifiable_areal_unit_problem). In our specific case (cities defined as Voronoi cells centered around airports) the dependency was exponential.

Presumably that just shows that it's easy to fit things with inverse-power functions, rather than revealing some fundamental truth about the behaviour the equation is modelling.
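To make the functional forms concrete, here is a rough sketch of how one might tell a power law from an exponential: a power law is a straight line on log-log axes, an exponential is a straight line on semi-log axes. The data below is synthetic, generated from the formulas in the comments, and this is not the fitting method of either paper.

```python
import math

def slope(xs, ys):
    """Ordinary least-squares slope of ys against xs."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = sum((x - mx) ** 2 for x in xs)
    return num / den

# Synthetic distances (km) and synthetic counts; not real data.
r = [1, 2, 4, 8, 16, 32]
power_law = [1000 / d ** 2 for d in r]               # N = 1000 / r^2
exponential = [1000 * math.exp(-d / 5) for d in r]   # N = 1000 * exp(-r / 5)

log_r = [math.log(d) for d in r]

# Power law: log N is linear in log r, with slope = -exponent.
power_exponent = -slope(log_r, [math.log(n) for n in power_law])

# Exponential: log N is linear in r, with slope = -1 / scale.
decay_scale = -1 / slope(r, [math.log(n) for n in exponential])

print(round(power_exponent, 6))  # 2.0
print(round(decay_scale, 6))     # 5.0
```

With noisy real data over a narrow range of distances the two fits can look similarly good, which is the point the reply above is making.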

I think that both this and what the paper says can hold true.

The paper is saying that from the point of view of the location, the distribution of effective distances traveled is invariant. So if you really like the 45min part of your city, then everyone closer really loves it, and so visits it much more than you do. Therefore the law holds true, even though on an individual basis you obviously go to the closer parts more frequently.

The fact that you might be prepared to go all the way to e.g. Disneyworld means that the people who live closer are massively more likely to go there to the point where people who live 5 miles from it go there all the time.

Averaged out over all inhabitants of the city and all areas of the city it wouldn't be surprising if it still held. What is accessible from your dwelling may be less accessible to someone else the same distance away but approaching from a different direction.

From the abstract:

> we reveal a simple and robust scaling law that captures the temporal and spatial spectrum of population movement

They throw around the word "temporal", but it's not clear exactly how they incorporate the time element without taking the plunge on the full study.

In the frequency of visitation, so that sentence seems to be a bit confusing.

Another interesting 'law' of this kind is Zipf's law, which observes that in any given country the largest city is roughly twice as large as the second largest, three times as large as the third largest, and so on. The same relationship shows up in word-usage frequency in languages.

What countries are consistent with Zipf’s law?

Not the USA, England, Australia, Thailand, China, or France.

US is quite close for the top 10 cities (~1 million 'too many' people in New York, 200k too many in San Diego, Austin and Dallas), but there are FAR too many cities with a population around 250-500k

England would be surprisingly close, if London had been half the size it currently is. If you took half the population of London and distributed it proportionally out among the rest of the country then you get quite close to a Zipf law distribution.

France doesn't really follow Zipf law for its 5 largest cities, but cities 6-15 follow quite closely.

Australia is 'weird' in that its two largest cities are basically the same size, but if we ignore Melbourne for a while the next few cities line up quite nicely before we start seeing a much faster drop-off in city population than Zipf's law would predict.

Thailand is way off, since it has 1 massive city followed by 10 cities of more or less the same size.

So yeah, no country follows it exactly, but it's closer in many cases than you might expect. Plus, comparing the actual distribution to Zipf's law is an interesting way of comparing urbanization in different countries.

In the US, for metros you have NY at 20M, LA at 13M, and Chicago at 9.5M.

But if you go by city - NYC is 8.5M, LA is 4M, Chicago is 2.7M.
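A quick sketch comparing those figures with what Zipf's law would give, assuming the simple form "population at rank k = largest / k". The function names are illustrative and the populations are the rough numbers quoted above, in millions.

```python
def zipf_expected(largest, rank):
    """Population Zipf's law predicts for the city at `rank` (1-based)."""
    return largest / rank

# Rough populations in millions, as quoted in the comment above.
metros = [20, 13, 9.5]   # NY, LA, Chicago metro areas
cities = [8.5, 4, 2.7]   # NYC, LA, Chicago city proper

def compare(populations):
    """Return (rank, actual, expected) for each entry."""
    largest = populations[0]
    return [(rank, actual, zipf_expected(largest, rank))
            for rank, actual in enumerate(populations, start=1)]

for label, pops in (("metro", metros), ("city", cities)):
    for rank, actual, expected in compare(pops):
        print(f"{label} rank {rank}: actual {actual}M, Zipf {expected:.2f}M")
```

By this rough check the city-proper figures (4 vs 4.25, 2.7 vs 2.83) sit closer to the 1/rank pattern than the metro figures (13 vs 10, 9.5 vs 6.67).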


New York-Newark-Jersey City, NY-NJ-PA [19,216.18]

Los Angeles-Long Beach-Anaheim, CA [13,214.8]

Chicago-Naperville-Elgin, IL-IN-WI [9,458.54]

Dallas-Fort Worth-Arlington, TX [7,573.14]

Except for many primate cities: https://en.wikipedia.org/wiki/Primate_city#List

> Studying anonymized cell-phone data, researchers discovered...

Anyone else get creeped out by the fact that this is becoming normalized?

(Background: it is impossible to anonymize location tracklogs.)

It's totally possible to anonymize location data for this study. The idea is that while the mobile provider has the location track logs, what it hands out instead is a fair sampling of actual trips, where a trip is two locations that some person visited one after the other at some point in time, and where locations are treated as identical if they fall in the same grid box of n km coarseness.
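A rough sketch of the grid-box idea, assuming positions are already planar coordinates in km. The function names and the sample track are hypothetical, and whether this level of coarsening is enough for anonymity is exactly what is disputed in this thread.

```python
import math

def grid_cell(x_km, y_km, cell_km):
    """Map a planar position to the index of its grid box."""
    return (math.floor(x_km / cell_km), math.floor(y_km / cell_km))

def to_trips(track, cell_km):
    """Turn one person's ordered positions into (origin, destination) cell pairs.

    Consecutive positions in the same cell are merged, so only moves
    between different grid boxes are kept. No user id or timestamp is returned.
    """
    cells = [grid_cell(x, y, cell_km) for x, y in track]
    return [(a, b) for a, b in zip(cells, cells[1:]) if a != b]

# Hypothetical track in km on a flat plane (illustrative values only).
track = [(0.4, 0.2), (0.9, 0.7), (3.1, 0.5), (7.8, 4.2)]
print(to_trips(track, cell_km=2))
# [((0, 0), (1, 0)), ((1, 0), (3, 2))]
```

A provider following the scheme described above would then pool such pairs across many people and release only a random sample of them.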

In fact, such datasets are available from telecom providers.

> Studying health records

> Studying spending patterns

> Studying voting intentions

> Studying twitter

The only creepy thing is if the data is being collected via dubious means. If they sign up a few thousand participants and use their data in a transparent manner, I don't have a problem.

If, however, they buy the data from the lowest bidder on the internet who has a free VPN app that logs location for resale, then yeah, that is shitty.

At least twitter is public by default and everyone participating has that expectation, while everything else is not.

Big difference studying the language of published books by authors vs. their private conversations in their homes.

Walking down the street is public. Are you ok with researching behaviour using tracking location based on observing this public behaviour?

Twitter is a conscious act of publishing info into a public sphere, and the public nature of it is very obvious and upfront. I don't think it's OK to dox someone if they publish under a pseudonym on twitter, though.

I'm not publishing my walking info that happens in public, even if the information is public, and someone has to do extra effort to get my unpublished information. I'm also not publishing my identity when I walk down the street, someone would have to dox me to figure out who I am on top of that, unless I was a public figure and they already knew who I was.

On the other hand if it's general information about a spot, I don't really mind too much. This happens already with traffic speed analysis on public roads for example for cars.


And there are many similar infrastructural analytics I don't care much about, as long as it's done to maintain a system and not to track identities, much like server traffic usage & CPU usage monitoring on servers.

How is it impossible to anonymize location tracking? Seems like if you randomize the starting and ending locations by a couple hundred feet, throw away data with <50 users, and don't use specific dates, it would be pretty anonymous.
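A minimal sketch of the two steps proposed here, jittering endpoints and dropping routes with fewer than 50 users. All names and the sample data are hypothetical, and it makes no claim that these steps are sufficient, which is the point under dispute.

```python
import random
from collections import defaultdict

def jitter(point, max_offset, rng):
    """Shift an (x, y) point by a random offset of up to max_offset per axis."""
    x, y = point
    return (x + rng.uniform(-max_offset, max_offset),
            y + rng.uniform(-max_offset, max_offset))

def suppress_rare(trips, min_users=50):
    """Keep only (origin, destination) pairs made by at least min_users distinct users.

    `trips` is an iterable of (user_id, origin, destination). User ids are
    used for counting only and are dropped from the result.
    """
    users = defaultdict(set)
    for user_id, origin, dest in trips:
        users[(origin, dest)].add(user_id)
    return {pair: len(ids) for pair, ids in users.items() if len(ids) >= min_users}

# Hypothetical data: 60 users travel A -> B, only 3 travel A -> C.
trips = [(u, "A", "B") for u in range(60)] + [(u, "A", "C") for u in range(3)]
print(suppress_rare(trips))  # {('A', 'B'): 60}

# Jitter an endpoint given in feet by up to 200 ft on each axis.
rng = random.Random(0)
print(jitter((1000.0, 2000.0), 200, rng))
```

Counting distinct users rather than trips matters here: one person repeating a rare route 50 times would otherwise pass the threshold.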

This is starting to feel like Asimov's psychohistory

>“We tend to think that there are lots of contextual aspects that affect the way we move, such as the transportation system, the morphology of a given place, and socioeconomic aspects. This is true to some extent, but what this shows is that there are some robust laws that apply everywhere.”

I'm no computational social scientist, but this analysis doesn't make sense. In fact it would seem to me that if the study took those variables into account the model would become more predictive. It's idealizing the analysis.

Can we get a rule where adjectives like startling, striking, profound, etc. get applied only after an application has been demonstrated? Speaking as someone who suffers from adjective overload.

Geographers have worked on a lot of these questions since the 1900s, and there is a lot of work and simulation in the domain using multi-agent systems (see the Journal of Artificial Societies and Social Simulation, https://www.jasss.org/JASSS.html).

Mathematicians and physicists often forget to cite their work :'(

The gist is that people bounce around to places with more people. Kind of like gravity.

Not very like gravity. They explicitly say "outperforms gravity models". Also the economic concentration laws behind this would often follow power laws - so maybe the inverse square is just a 2nd order approximation rather than a real fundamental principle.

Why does the SciAm web page gobble CPU?
