
Inequality and Mass Transit in the Bay Area - dangrover
http://dangrover.github.io/sf-transit-inequality/
======
kevinpet
"The results show the Bay Area's economic inequality and its relationship with
transit and urban form."

I'm not sure what they are supposed to show. It seems to show me that there's
very little relationship between income and mass transit service.

~~~
patmcguire
Yeah, the Bay Area has too many rich outer areas to make any sense of it. The
one about New York from about a month ago is a lot more clear - rich center,
poor periphery. <http://www.newyorker.com/sandbox/business/subway.html>

~~~
tomkarlo
In both NY and SF, if you keep going further out (beyond the base mass transit
system) incomes start to rise again. There's basically the rich central city,
the poorer surrounding "urban" neighborhoods, and then the rich suburbs.
Chicago too.

~~~
ripter
Indianapolis too

------
null_ptr
Could we please get username.github.io subdomain support on HN? Unlike the old
github.com, these are all user content so it makes sense to distinguish them
just like for wordpress.com.

~~~
saraid216
I added this to the Feature Requests thread, but I have no clue if pg actually
looks at that.

------
newbie12
Income and wealth are not the same thing. Wealth is a measure of total assets,
while income is annual wages and investment income. A lot of high income
people carry huge student loan and housing debt and are not wealthy.

~~~
mikeash
Housing debt won't drag your total assets down unless you happened to buy a
house that's now underwater. The usual situation is for the difference of
house value and mortgage amount to be significantly positive.

Income and wealth aren't the same thing, but they are _highly_ correlated.

~~~
Afforess
>Income and wealth aren't the same thing, but they are highly correlated.

No they are not.

 _According to the US Treasury:_

"These low realized rates of return call into serious question the use of
realized income from capital as part of any measure of well-being or ability-
to-pay. For owners of capital, _economic income may have little relationship
to realized income_ , and rates of realization may vary according to the
assets they hold."

 _According to the Federal Reserve:_

"...very wealthy people try quite hard to minimize their income."

[1] <http://www.treasury.gov/resource-center/tax-policy/tax-analysis/Documents/ota50.pdf>

[2] <http://www.federalreserve.gov/econresdata/scf/files/wealthincome7web.pdf>

~~~
mjn
They are actually quite correlated. Your links argue that there are cases
where wealthy people don't have high income and vice-versa, but that doesn't
disprove the quite strong general trend.

See table 5 (pg. 36) here, which shows both net worth by income percentile,
and income by net worth percentile, to see the strong relationship between
being high-net-worth and high-income:
<http://www.federalreserve.gov/pubs/feds/2009/200913/200913pap.pdf>

Example numbers: the top 1% of 2007 income-earners held 26% of U.S. wealth,
and the top 1% of households by net worth in 2006 earned 16% of the year's
income. Meanwhile, the bottom 50% of income-earners held only 14% of U.S.
wealth, and the bottom 50% by net worth earned only 22% of U.S. income. The
general trend holds in the in-between categories as well: the 95-99th
percentile of incomes hold about twice as much wealth as the 90-95th
percentile, etc.

So the relationship is not perfect, but taking the groups in aggregate,
higher-income-earners control considerably more wealth than lower-income-
earners. Some of the later figures in the document explicitly plot some
ratios.

~~~
Afforess
The correlation exists, but is not extremely strong, as the GP stated, which
is what my point was. I never said there was no correlation.

~~~
mjn
How is it "not extremely strong"? If there were _no_ correlation, you would
expect the top 1% of income earners to hold 1% of net worth, because there's
no expected relationship between being high in income and high in net worth.
If there were a correlation but a _weak_ one, you might expect them to own a
few times the uncorrelated amount. Maybe they'd own 2% of U.S. assets (
_twice_ the otherwise expected amount), maybe even 5% ( _five times_ the
expected amount). Either of those would be enough to establish a clear
relationship, but a weak one.

But they actually own 26% of American assets, _twenty-six times_ the amount
you'd expect in the uncorrelated case! The top 1% of Americans by income own a
full quarter of all the country's assets— stocks, bonds, real-estate, etc.
That seems like a pretty strong relationship.
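
The comparison in this comment can be sketched as a concentration ratio: a group's share of the total divided by its share of the population, where 1.0 is what you'd expect with no relationship at all. The figures are the ones quoted upthread from table 5; the name `concentration_ratio` is only an illustrative label and does not come from the paper.

```python
def concentration_ratio(share_of_total_pct, share_of_population_pct):
    """How many times its 'uncorrelated' share a group holds (1.0 = no relationship)."""
    return share_of_total_pct / share_of_population_pct

# Figures quoted upthread: (percent of total held, percent of population).
groups = {
    "top 1% by income, share of wealth": (26, 1),
    "top 1% by net worth, share of income": (16, 1),
    "bottom 50% by income, share of wealth": (14, 50),
    "bottom 50% by net worth, share of income": (22, 50),
}

ratios = {label: concentration_ratio(*pair) for label, pair in groups.items()}

for label, ratio in ratios.items():
    print(f"{label}: {ratio:.2f}x the uncorrelated share")
```

Ratios far from 1.0 at both ends are what the argument rests on. They describe the groups in aggregate and say nothing by themselves about how individual households are spread within each group.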

~~~
enoch_r
This just isn't good statistics. The fact that the _vast_ majority of
hysterectomies are performed on women doesn't tell us anything about the
probability that a randomly selected woman has had a hysterectomy; the fact
that the vast majority of wealth is held by the rich doesn't tell us anything
about the probability that a randomly selected rich person has a lot of
wealth.

~~~
mjn
Err, the fact that the vast majority of hysterectomies are performed on women
_does_ tell us that hysterectomies are strongly correlated with sex. That was
the initial dispute: whether wealth and income are strongly correlated or not.
The distribution is a separate argument, although I'll note that the PDF I
linked has some data on the distributions as well, and it does not support the
"weak link between them" argument. The proportion of very-wealthy people with
low incomes, and very-high-income people with low wealth, is actually quite
small.

~~~
icambron
I took a quick crack at the correlation implied by table 5 in that paper you
linked. Assuming wealth was constant across the ranges specified, I came up
with a correlation of 0.45. So meaningful but not high. I personally suspect
that it's actually much more highly correlated and that fact would come out if
I didn't have to assume wealth was constant for 0-50 and 50-90, but it would
be conclusory for me to incorporate that into my numbers.

------
NelsonMinar
Nice visualization, I particularly like the link between the income graph and
the map on the upper right. Built using D3, TopoJSON, Bootstrap, Angular, and
JQuery. (That's a lot of frameworks!)

------
YokoZar
Slightly more interesting to me would be the median income of a person getting
off at a particular stop, rather than the folk who live there. Plenty of
people work in the poorer parts of the city and live in relatively expensive
neighborhoods, and plenty of people also do the inverse.

~~~
dangrover
Wish I could find that data. Some stops are weird because they're all hotels
(Powell) or all businesses (Montgomery).

------
nextstep
It's cute that someone made this in response to the New Yorker piece. San
Francisco always wants to feel like it's one of the big guys like NYC.

If anything, this transit/income data shows how little of a correlation there
is between the two. This doesn't surprise me: most people in SF have cars, and
muni/BART are embarrassingly awful compared to the big the cities in the US.

~~~
WildUtah
_compared to the big the cities in the US_

cities ==> city

The only big city in the USA with decent transit is NYC. San Francisco is
comparable to Boston, DC, and Chicago and far ahead of LA, Dallas, and Miami.

By world standards, NYC is barely average and the rest of the country has no
transit system at all to speak of.

~~~
bradleyjg
London, Paris, Seoul, and Tokyo all shut down most of their mass transit
system overnight. NYC runs it all night long (albeit with fewer trains and
some diversions). By the important metric of availability that puts it among
the top.

~~~
WildUtah
Indeed, 24-hour rail service, even though it runs at infrequent and irregular
intervals, is a big point bringing NYC up to barely average in its peer group.
Other measures like coverage (suburbs count), travel times, intermodal
operations (awful airport connections), connectivity (think Jersey), and
jitney service and taxi availability drag NYC's score down.

I can count NYC as barely average only because the peer group includes not
only London, Paris, Seoul, Tokyo, and Osaka but also Rio de Janeiro, Moscow,
Istanbul, Mexico City, and Buenos Aires. If I'd just used your list, NYC would
be dead last in almost every category.

~~~
drstewart
If you're going to use taxi service to judge how good public transit is, your
methodology is probably flawed.

------
binarycrusader
One thing the graph doesn't represent is how often a particular train stop is
serviced.

As an example, since (August?) of 2005, Atherton only has weekend and special
event caltrain service. You could argue that shouldn't be surprising given the
graph (median income of ~$193K), but it's still an important piece of missing
information.

It would have been nice to see a graph based on total number of stops
scheduled for a station overlaid with the median household income as it is
currently graphed for caltrain.

------
kevingadd
I've lived here for a few years and the median income in Redwood City still
surprised me. Wow.

Click the Caltrain 'Local' route and see for yourself. I had to double-check
it since it seemed so implausible to me, but it seems to be roughly accurate.

~~~
eande
I am not sure where their data is pulled from, but the numbers are not
correct, at least for Redwood City.
<http://www.city-data.com/city/Redwood-City-California.html>
Redwood City median income in 2009 was $67,611.

~~~
dangrover
Data is for the census tracts where the stops are located, which admittedly
makes more sense for subway/streetcar/bus stops than commuter rail stops
(where people drive from other tracts and park at the stations). The BART is
particularly troublesome because it takes on characteristics of both types of
transit.

A data science friend of mine said we should do a "watershed" model where we
define an area where people flow into a stop, but I'm not sure how best to do
that! Maybe someone smart can fork the project and improve on our methods.

~~~
djcapelis
Probably you could just bunch census tracts within a certain distance of any
stop and count them as being part of their nearest stop and then average them.
(Weighing correctly for population differences.) If you wanted to do it
better, you could have the weight for each census tract fall off with distance
from the stop.
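
A minimal sketch of that suggestion, under several assumptions that are not from the project: tract and stop positions are planar coordinates in kilometres, each tract is assigned to its nearest stop if within `max_km`, and the optional distance falloff is exponential with scale `falloff_km`. All names and sample numbers are hypothetical. Note that this yields a weighted mean of tract medians, which is not the same thing as a true median over the pooled households.

```python
import math
from collections import defaultdict

def catchment_incomes(stops, tracts, max_km=1.5, falloff_km=None):
    """Population-weighted mean of tract median incomes, per nearest stop.

    stops:  {name: (x_km, y_km)}
    tracts: list of {"pos": (x_km, y_km), "population": int, "median_income": float}
    """
    weighted_sum = defaultdict(float)
    weight_total = defaultdict(float)
    for tract in tracts:
        # Find the stop closest to this tract's centre.
        nearest, dist = min(
            ((name, math.dist(pos, tract["pos"])) for name, pos in stops.items()),
            key=lambda pair: pair[1],
        )
        if dist > max_km:
            continue  # tract is outside every stop's catchment
        weight = tract["population"]
        if falloff_km is not None:
            # Optional: tracts further from the stop count for less.
            weight *= math.exp(-dist / falloff_km)
        weighted_sum[nearest] += weight * tract["median_income"]
        weight_total[nearest] += weight
    return {name: weighted_sum[name] / weight_total[name] for name in weight_total}

# Hypothetical data, for illustration only.
stops = {"A": (0.0, 0.0), "B": (10.0, 0.0)}
tracts = [
    {"pos": (0.0, 0.0), "population": 1000, "median_income": 30_000},
    {"pos": (1.0, 0.0), "population": 3000, "median_income": 70_000},
    {"pos": (10.0, 0.5), "population": 500, "median_income": 100_000},
    {"pos": (5.0, 5.0), "population": 100, "median_income": 50_000},  # too far from both
]

flat = catchment_incomes(stops, tracts)
decayed = catchment_incomes(stops, tracts, falloff_km=1.0)
print(flat)
print(decayed)
```

With the falloff enabled, stop "A" is pulled toward the income of the tract it sits in, since the larger tract a kilometre away is down-weighted.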

------
charlieok
If you follow many blogs of people in the data visualization community, you'll
see right away how often they talk about an iterative process in drawing out a
story.

There's a big exploratory component, where they investigate possible
approaches and try and find something interesting they can show.

On the one hand, you want some significant features and relationships that
exist in the data to be apparent to any intelligent reader who spends a little
time studying your visualization.

On the other hand, you don't want to distort the data, or impose an
interpretation on it that isn't warranted.

For a well-done visualization, there is definitely a lot more work going into
the final product than simply plotting some dimensions against some other
dimensions.

As an example, the New York Times pours a lot of money (and therefore talent
and person-hours) into their visualization work. Some of the behind-the-scenes
of that operation is blogged about here: <http://chartsnthings.tumblr.com/>

------
paul9290
I visited San Fran last week and with each visit I'm shocked by the plight of
the homeless there.

I've lived in (NYC, Philly & more) and visited many cities across the US and
never witnessed this on such a scale.

It made me wonder what the government is doing there to help with what I see
as an epidemic?

~~~
tlrobinson
Giving people money to be homeless probably doesn't help.

 _"The city of San Francisco, California, due to its mild climate and its
social programs that have provided cash payments for homeless individuals, is
often considered the homelessness capital of the United States"_

<https://en.wikipedia.org/wiki/Homelessness_in_the_United_States#San_Francisco>

~~~
tomkarlo
That wikipedia page is a train wreck. It's pretty clear there's an edit war
going on between two political factions that are uninterested in keeping
things factual. Wikipedia is great but in some situations the quality of the
content goes way downhill at the hands of people with alternate agendas.

------
wgoodwin
Well done (I'd say better than the original).

One of the things I find most surprising is the data on the bus lines. I'd
assumed, as a recent LA transplant, that the bus lines would generally serve
worse off neighborhoods (it is always thus in LA; minority and lower income
neighborhoods get mediocre bus service, while wealthier neighborhoods get
expresses and light rail).

Of course, it would be interesting to overlay the stops of the various
corporate buses on top of this information. My guess is all those high points
have private alternatives serving them.

Final point: this might be best for the questions it raises. How does service
compare across lines? How many people does a line move and how fast? How much
is the line getting subsidized (BART, I'm guessing, crushes the others in that
regard).

~~~
potatolicious
The bus routes are indeed IMO the most interesting bits of the data.

The trick with San Francisco is that because of the buckshot nature of public
housing developments in the city, poor areas are mixed in surprisingly evenly
with wealthy areas. This creates a lot of negative effects for residents - the
expensive and trendy Hayes Valley for example, is right next door to an
_extremely_ high-crime area, the Western Addition. Keep going a bit further
and you hit the Fillmore, which is again a wealthy, trendy area.

SF does this at micro-scale. In a given neighborhood there can be extremely
good blocks that are directly next to extremely bad blocks. It's not hard to
walk 300 feet and end up in a completely different-seeming universe.

One thing that's interesting to note is that SF buses stop _very_ often, so
the highs and lows aren't really spread across a large geographic distance,
they are often separated by only a block or two. The "cliffs" in the graph
really _are_ that steep when you project it onto a map.

~~~
guelo
Having crime spread into rich neighborhoods instead of being concentrated in
only poor neighborhoods seems like a positive for residents overall since the
more affluent neighborhoods have more resources to deal with it. Unless when
you say "residents" you mean only the rich ones like yourself.

~~~
djcapelis
It's an interesting question of how that shakes out! Here's the map of police
districts: <http://sf-police.org/index.aspx?page=868>

You'll notice the Tenderloin (a high crime area) has its own station
partitioned off from the rest of the districts. The others are also
interesting. The Mission station for instance, serves a really broad community
which includes both the Mission District (currently undergoing gentrification)
and the Castro (gentrification complete). On the other hand, districts like
Bayview, Ingleside, Taraval and Richmond are just larger, generally more
residential and less in the center of everything. The Park and Northern
districts on the other hand, are mostly affluent with a few outliers.

------
ZanyProgrammer
I'm a little disappointed, insofar as this data tells us nothing we don't
already know: the area around Fremont BART is more affluent than Fruitvale,
OMG! But it is a nice bit of data visualization, and I applaud them for that.
I live in the Bay Area, develop software, and rely on public transit every day.
I definitely see improvements in what data should be presented.

For example, for Caltrain, break each graph down further. Find the number of
passengers who board at Baby Bullets, NB and SB, for the weekday commutes.
What you need is more demographic data about the riders of each system, at
specific times. Anyone who's been on Caltrain can tell you that the weekday
commute is very much a white collar commute. Finance, Law and Tech going north
to SF, mostly tech going south to MV (by the time baby bullets get to SJ
Diridon, they are very empty). Palo Alto is an outlier, a lot of people
commute from points south to Stanford, which obviously isn't a tech company.

Having said that, since I really enjoy developing transit software, I'm really
going to take a look at the code and see what I can do. It's a really good
start, and I'm happy they posted it to HN.

------
tome
I don't see what this has to do with mass transit. It seems to be a gimmick.
Why not just plot this information with a (two-dimensional) heatmap?

~~~
dangrover
It is a gimmick, but if you're not a car person, your mental model of the
place you live in is probably closer to a topological transit map, since you
largely get around by hopping between nodes. It's easier to connect with data
displayed that way. And it's also interesting to see what the areas are like
near the nodes you pass by every day on your commute but don't get off at.

~~~
tome
A heatmap superimposed on a topological transit map would also have been fine.
I just don't see what the one-dimensional version buys.

------
greggman
The data seems a little suspect. For example, the L line shows Montgomery
Station as having low income. AFAICT there's almost zero housing at
Montgomery Station, and if there is any it's most likely super rich, as it's
directly in the Financial District.

My only point is before I can even try to get any meaning from this I need to
know the data makes sense. Maybe there are a bunch of low income apartments
near Montgomery Station but if so they sure are well hidden.

edit: Checking the Fremont line it shows the median income at Montgomery
Station as $112k, whereas the L line shows the same station as $23k. Something
seems wrong or else I don't understand what it's showing.

~~~
perryh2
I also _highly_ doubt that the median income for Union City is $138k/year.

------
scottshapiro
Given the volume of Peninsula-shuttled tech workers living within a mile of
16th and 24th st bart stations, I'm surprised to see those so low. Garbage in,
garbage out from census data for transient, high-rental areas like the
Mission, Noe, Castro, Bernal, Potrero. Indexing to rental rates in these areas
is likely more reflective of wealth (via affordability as a proxy):
<http://sfist.com/2013/03/07/map_average_rent_for_1br_in_san_fra.php>.
Awesome visualization though.

~~~
azernik
Within a mile doesn't matter - the OP is measuring by census tracts, which are
quite small. For example, if you zoom in on this
[<http://projects.nytimes.com/census/2010/map>], you'll see that the Mission
BARTs' census tracts (201 and 209) extend to only within a couple of blocks
from the stations, which seem from my daily commute to be the poorest and most
run-down parts of that area. (The tracts are even small enough that there's a
separate tract - 208 - for the stretch of Mission between the two BARTs.)
While some of the tech workers I know do indeed live between Van Ness,
Valencia, Cesar Chavez, and Market, a lot live a few blocks east or west of
that line, e.g. on the west side of Valencia, or in the area between Folsom
and Potrero.

A side question, though - is there a specific year when the shift of tech
workers to SF picked up steam? I'm finding a wave of articles complaining
about the Google shuttles in 2012-13 (about when I moved out to SF), but I
don't have a good feel for how far along the process was at that point.

~~~
scottshapiro
Fascinating.

Based on rapidly increasing rents over the past 2 years in these
neighborhoods, tech worker density has increased a lot. Anyone living in these
areas in 2009-2010 will tell you that they'll never break their lease because
market rents are 50-150% more than what they're paying due to strict rent
controls.

------
jcomis
Somewhat related: <https://vimeo.com/63147860> a 24 hour visualization of SF
public transport ridership. Each circle represents a stop.

------
greghinch
I've been wondering a lot recently how long it will be until the Tenderloin is
completely gentrified. If SF expansion remains unchecked, it has to be just a
short time, as it's literally become a pocket of poverty surrounded by yuppies

~~~
WildUtah
It's been a pit of crime and poverty surrounded by opulence for twenty years
now. The City's government works hard to preserve the Tenderloin as it is by
blocking development and gentrification with building codes, tenant rights
measures, permitting, policing strategies, homelessness subsidies, and more.

~~~
greghinch
The Mission has only fairly recently become young, white hipster-ville though.
Soma seems to go up and down with the tech scene, so assuming the current
growth isn't the same kind of bubble (I don't think it is), that is just going
to continue to climb as well. Certainly if all those people gentrifying the
surrounding areas stay put for any length of time, that kind of collective
monetary influence is going to change any city policies that may keep the
Tenderloin how it is

------
gweinberg
A lot of the people that live near the stops in downtown San Francisco may be
pretty poor, but I think most of the people using BART at those stops don't
live there, they just work there.

------
cmelbye
I thought it was interesting how the CalTrain curve flattened out when you
look at the bullet or limited service graphs instead of the local graph.

~~~
ZanyProgrammer
That's because Baby Bullets hit stations that tend to be in more affluent
areas. Palo Alto, Mountain View, Redwood City, San Mateo, etc. Certainly
Bayshore (Visitacion Valley) isn't dragging those down.

------
ultimoo
Wow, the most striking is the sudden jump between Redwood City and Atherton on
the Local Caltrain route. It jumps from ~30K straight to ~193K.

~~~
mjn
That's partly true, though the area _right_ around the Redwood City Caltrain
(this uses census tracts, not the whole city) makes the difference much larger
than if data for Redwood City as a whole was juxtaposed with Atherton as a
whole. The area right around the Redwood City Caltrain is a bit sketchy,
particularly on El Camino (the downtown on the other side is nicer). Not too
sure why it hasn't gotten nicer, since it's so convenient to train service (30
mins from SoMA on the baby bullet Caltrain).

------
gcb0
heh, you should try L.A.

the buses that go to the poor neighborhoods don't even run on the main
streets when in the good parts of the city. Also, the poor people buses have
tinted windows!

if you can find maps, compare Metro ($1.25, short routes) routes with Dart
($0.75?, long routes) ones.

------
dimva
Is this median personal income, or median household income? The page doesn't
make that clear.

~~~
dangrover
Median household.

------
irollboozers
There are people on the Bayshore Express line living below the poverty line.
Wow.

~~~
djcapelis
This isn't that surprising, Bayshore is not like most SF neighborhoods and has
some interesting patterns.

------
greesil
Yes, Oakland is poor. Thanks for sharing.

------
joshrotenberg
typo in the first sentence:

"both extreme povery and wealth, ..."

~~~
bitanarch
One thing I don't understand though - how can the people in Atherton afford
their houses with "only" $200k income? (yea $200k is a lot, but those houses
cost a freakin' lot more!)

~~~
fyi80
With $200K income, you can easily afford to allocate 25% of that to mortgage
interest (plus 10% for principal, taxes, etc.): $50K/year.

3.5% mortgage -> 50K/yr * 100% / 3.5%/yr = $1.5M house price

interest rates are so low these days, you can spend 50% more on principal for
the same interest, compared to 2006 when rates were ~6.25%.
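
The back-of-envelope calculation above can be sketched as follows. It assumes the same simplification the comment uses (an interest-only budget, ignoring amortization, down payment, and taxes); the function name `interest_only_price` is hypothetical.

```python
def interest_only_price(annual_income, interest_share, annual_rate):
    """Loan size whose yearly interest equals a fixed share of income.

    Interest-only simplification: ignores principal repayment, taxes,
    insurance, and any down payment.
    """
    return annual_income * interest_share / annual_rate

# Inputs taken from the comment: $200K income, 25% to interest.
price_at_3_5 = interest_only_price(200_000, 0.25, 0.035)
price_at_6_25 = interest_only_price(200_000, 0.25, 0.0625)

print(f"at 3.5%:  ${price_at_3_5:,.0f}")
print(f"at 6.25%: ${price_at_6_25:,.0f}")
print(f"ratio:    {price_at_3_5 / price_at_6_25:.2f}x")
```

Under this simplification the 3.5% figure comes out a little under the $1.5M quoted, and the ratio between the two rates is 6.25 / 3.5, so the "50% more" in the comment is if anything conservative.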

~~~
BillSaysThis
$1.5M doesn't really buy a lot of house in Atherton, at least more than a
block or two from El Camino Real.

------
fyi80
Another horrible repeating of the "income inequality" trope, and then showing
graphs of median income that don't get near the top 1% of US earners, while
citing a stat that the _top 1%_ of earners are gaining.

~~~
djcapelis
Do you honestly expect whole census tracts to have median incomes at the level
of the top percentile?

A factor of ten difference between census tracts in the same city is worth
examining.

~~~
tomkarlo
Isn't that what you'd expect? There's a factor of ten difference these days
between the poverty line and a fairly typical white-collar income (call it
$120K.) Is it really surprising, or telling, that at the tract-level you'd
find a factor of ten difference in a major city? Is there a major city where
you don't see that?

~~~
djcapelis
You usually see a gradient where the tract-to-tract variance is less
pronounced. A city will almost always contain tracts where there is a 10x
difference between the highest and lowest numbers, but it is not as frequent
that those tracts will be located right next to each other. There are several
other cities which are like this, Delhi and to some extent Philadelphia both
come to mind and I think it is notable there as well.

And, for whatever it's worth, I also think the variance even when the tracts
are not next to each other is _equally_ worthy of examination. Just because it
is common does not mean we shouldn't talk about it. In many
ways, when the neighborhoods are next to each other it is a good thing for
visibility.

~~~
tomkarlo
New York has this... I don't think it's atypical. Tracts are so small that it
only takes a single development to raise or depress median income grossly.
Take a look at the middle of Manhattan: there are tracts with $10K next to
tracts with $120K, but they might have as few as _20 households_ in tract,
which means that a single building with 11 low-income tenants would pull the
median down to the poverty line.
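
A quick sketch of how sensitive a small tract's median is, using the $10K and $120K figures from the comment as the only two income levels. That two-level split is a simplifying assumption, and `tract_median` is a hypothetical helper.

```python
from statistics import median

def tract_median(low_households, high_households,
                 low_income=10_000, high_income=120_000):
    """Median household income for a tract with only two income levels."""
    incomes = [low_income] * low_households + [high_income] * high_households
    return median(incomes)

# A 20-household tract: moving one or two households across the line
# swings the median from one extreme to the other.
for low in (9, 10, 11):
    print(f"{low} low-income of 20 -> median {tract_median(low, 20 - low):,.0f}")
```

With 11 of 20 households at the low level, both middle values of the sorted list are low-income, so the median sits at $10K regardless of how high the other nine are.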

<http://www.wnyc.org/blogs/wnyc-news-blog/2011/dec/08/census-locates-citys-wealthiest-and-poorest-neighborhoods/>

