Show HN: Crime Doesn't Climb in San Francisco (github.com)
145 points by gwintrob 1504 days ago | 74 comments

There are a lot of confounding variables here, and others have mentioned a few.

One I have not seen is "proximity to major arterials". Sometimes crime concentrates around major streets and parking lots where cars can be stolen or broken into. (The OP mentions "grand theft from locked auto" as being ~10% of crimes, for example.)

Another common crime, shoplifting and petty thefts, occurs around strips of shops, urban malls, transit stations, or bus stops.

These crimes have little to do with altitude, and they are common enough to affect the results. I know this from inspecting the LAPD CompStat maps, which are superior, in many ways, to the linked one. One place to see them is:


This is Hollywood. You can notice the concentration of crimes along Hollywood Blvd, or the other major east-west streets. (The zoom tool is on the right, and it may be good to increase the time span to a month.) Inspecting the causes shows they are often thefts from vehicles, etc., as mentioned above.

This is invalid. The numbers calculated are counts of crime incidents, not crime rates. A km^2 block with ten times the population density will have ten times the number of crimes; this doesn't mean it is ten times more dangerous!
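A minimal sketch of the count-versus-rate distinction. The block names, incident counts, and populations are hypothetical, invented only to illustrate the point, and are not taken from the SF data:

```python
# Hypothetical 1 km^2 blocks; every number here is invented for illustration.
blocks = {
    "dense_block": {"incidents": 100, "population": 10_000},
    "sparse_block": {"incidents": 10, "population": 1_000},
}


def rate_per_1000(incidents, population):
    """Incidents per 1,000 residents."""
    return 1000 * incidents / population


for name, block in blocks.items():
    rate = rate_per_1000(block["incidents"], block["population"])
    # The raw counts differ by 10x, but the per-resident rate is the same.
    print(f"{name}: {block['incidents']} incidents, {rate:.1f} per 1,000 residents")
```

With these made-up inputs the dense block has ten times the incidents but the same per-resident rate, which is the case the comment describes.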

A block with ten times the population density and four times the number of residences that has ten times the number of home break-ins is more dangerous - there are more crime victims. I suppose this means that the type of crime is important.

I live in the low-crime area of SF depicted in the graph, and I can confirm that most definitely there is a much lower population density than in other areas of SF, including places like PacHeights. It is mainly residential, with the houses being roughly 100 years old. Most people in SF that I know have never been to that area, mainly because there's no reason to unless you live there.

But then again, the crime rates in Outer Sunset are relatively low as well, and they are at roughly sea level, and they are in stark contrast to SOMA, etc.

Mind you, the houses in that area start at $900k for 2br/1ba houses, and 1.2M for 3/2 of reasonable size. There are very large houses as well for $2M+, especially in St. Francis Wood, and Forest Hill.

Good point. Also, rich people live in the hills. Of course there is more crime in poorer areas with more density and with less care from police. Just put SF in the title and you'll still get a lot of self-loving SF people to click.

Crime per km^2 is just as valid as crime per person.

If I want to know the chances of someone breaking into my apartment I want the data per-apartment. If I want to know the same for cars I want the data per-car. If I want to know my chance of getting shot I want the data per-person. Overall crimes per person approximates what we care about much better than crimes per unit area.

> If I want to know my chance of getting shot I want the data per-person

In reality though, it isn't as simple as that. Sure, I don't want to be shot. I also don't want other people to be shot a lot in my neighborhood. If I walk to work and see that a car parked on the street has been broken into, that bothers me, even though my car is in a secure parking garage.

I don't want to be the victim of crime, but I also don't want to live amidst crime. I don't want to see it, I don't want to hear about it, and I don't want to think about it. Crime per km^2 is nearly as important to me as crime per capita.

To put it another way: if you live alone, statistics measured per-person are directly applicable to the chance something will happen to you. If you have a family who lives with you, though, the combined probability distribution of something happening to someone in your household changes quite dramatically.

These are rare events, so the chance of something happening to someone in your family of N is close to N times the chance of it happening to you.
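A quick check of the "close to N times" approximation, assuming each household member faces the same risk independently. The per-person probability below is hypothetical, chosen only for illustration:

```python
def p_any(p, n):
    """Chance that at least one of n people is affected,
    assuming each faces independent risk p."""
    return 1 - (1 - p) ** n


p = 0.001  # hypothetical per-person risk, not a real crime statistic

for n in (1, 2, 4):
    # For small p the exact value sits just below the n * p approximation.
    print(f"n={n}: exact={p_any(p, n):.6f}  approx={n * p:.6f}")
```

The approximation gets worse as p or N grows, and it breaks down if the risks within a household are correlated.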

It's valid if you're a square kilometer of dirt and you're afraid of being mugged on.

Or, I suppose, if you're trying to decide which square kilometer of dirt to build apartment complexes on based on projected property value trends (where crime is always a large factor.) These tend to be the people who pay for this sort of data (and thus incentivize its collection), which is why it's put in their terms.

In 2011, NYC had 0.65 murders per square kilometer (515 murders in 784 sq km of land).

In the same year, Gary IN had 0.23 murders per square kilometer (30 murders in 129 sq km of land).

Do you feel roughly 2.8 times more likely to be murdered in NYC than you do in Gary, IN?

For comparison with the traditional per-100,000-people rates, NYC had a rate of 6.3 in 2011 versus 37.2 in Gary.
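The comparison can be reproduced from the figures quoted in this comment alone. This is only a sketch of the arithmetic; the dictionary layout and function name are my own:

```python
# 2011 figures as quoted in the comment above.
cities = {
    "NYC": {"murders": 515, "area_km2": 784, "per_100k": 6.3},
    "Gary, IN": {"murders": 30, "area_km2": 129, "per_100k": 37.2},
}


def per_km2(city):
    """Murders per square kilometer of land."""
    return city["murders"] / city["area_km2"]


# Per unit area, NYC looks worse than Gary.
area_ratio = per_km2(cities["NYC"]) / per_km2(cities["Gary, IN"])

# Per 100,000 people, Gary looks worse than NYC.
capita_ratio = cities["Gary, IN"]["per_100k"] / cities["NYC"]["per_100k"]

print(f"NYC / Gary, per km^2: {area_ratio:.1f}x")
print(f"Gary / NYC, per 100,000 people: {capita_ratio:.1f}x")
```

The two normalizations rank the cities in opposite order: NYC comes out near 2.8 times Gary per square kilometer, while Gary comes out near 5.9 times NYC per capita.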

If you scroll down you'll see they adjust for that in a followup graph ...

No, they normalize by land area (km^2).

I'd like a map overlay of median (not average! that means nothing in SF!!!) income of those areas. Something tells me higher elevation = way more expensive, especially here in SF.
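A small sketch of why the median is the safer summary when a few very high incomes are present. The income list is hypothetical and invented for illustration:

```python
from statistics import mean, median

# Hypothetical household incomes for one map cell; not real SF data.
incomes = [40_000, 55_000, 60_000, 75_000, 5_000_000]

# One outlier drags the mean far above what a typical household earns,
# while the median stays at the middle household.
print("mean:  ", mean(incomes))
print("median:", median(incomes))
```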

I don't think the common thief is wandering about on billionaire-row[1]. I think this has more to do with wealth than land-elevation itself.


I lived in a crappy $500/month place in Chinatown for a year, ending this spring. It was the lowest rent building I've ever heard of in SF and most of my neighbors were elderly immigrants with limited English (or even Mandarin!) language skills.

It was way safer than any of the three directions that went downhill (i.e. towards Market, towards the Embarcadero or towards Fisherman's Wharf). There was also a notable absence of beggars. I think there really is something to it just being too much of a PITA to hike up all those hills.

AFAIK, that's traditionally the way it's been in SF. Altitude correlates with wealth.

Potrero Hill might be the exception.

It happens in the southern sections of the city too (which are not included on the map). Ingleside Terrace is considerably more expensive than the OMI (Ocean View, Merced Heights, Ingleside). Similarly, Mission Terrace, which is not a posh neighborhood, is nonetheless a bit more expensive than the Excelsior, which is elevated ground toward McLaren Park.

Having lived in Potrero, I've noticed that the houses get significantly ritzier towards the top. But the side facing away from the city is a pretty steep downgrade.

That's exactly the point behind "Crime Doesn't Climb" - it's a reference to the wealthy being "Up the Hill". Though there are some flatter high wealth areas in the city, the hills are much more exclusively wealthy.

To your point on mean versus median, I agree that median is more accurate. Even better would be homeless %, or % under poverty.

This is why I'd be interested in looking at the graph for the neighborhoods south of 280, particularly in the eastern section of the city. In many areas, property values drop and elevations rise as you head south.

That doesn't necessarily mean more crime, nor does it necessarily mean an identical mix of crime. That's a big part of why it would be so interesting to take a look.

I really like the work here. Very cool graph and visualization, and if there are things I'd like to see, it's not that they are "missing", it's that the approach is triggering some ideas for how to look at and interpret the data.

What about weighted by population density?

Great point. There are a lot of interesting ways to analyze this data (e.g. population, public transportation, home prices, etc.)

More accurately, the MUNI doesn't climb. In the nicer areas there is no light rail service, only a few filthy buses.

Are you implying that public transportation causes crime?

There's a widely-held view that public transit is associated with higher crime rates, but it's a myth:


In fact, the data shows that if anything, introducing more transit will reduce crime.

> In fact, the data shows that if anything, introducing more transit will reduce crime.

Careful. The provision of new or improved public transport facilities often coincides with, or precedes, the gentrification of an area.

I'd hypothesize this is the more significant causal mechanism at play; ne'er-do-wells driven out of an area by higher property prices.

I think it's just incredibly inconvenient to get from the Tenderloin to the ritzy areas of SF by public transit. You pretty much need a car or car service.

The TL is immediately adjacent to the ritzy commercial area (Union Square), to the point where tourists accidentally walk there.

Bayview/HP aren't even that isolated -- the T muni line goes there, along with lots of buses.

At some level it's inconvenient to go anywhere in SF, but Pac Heights isn't too hard to get to by transit. The Marina is low elevation, but separated by some hills, but even that isn't inaccessible.

Well, there are cable cars but thugs and crooks don't like to ride with tourists. There are some social lines one simply doesn't lower oneself to cross (unless mugging them directly -- that's like social justice).

Ding ding ding, we have a winner.

I think grandalf is implying that public transportation tends to be an attractor to some people who are more likely to commit street crimes.

Notice the street. I'd say that financial crime is more likely to be done by Uber riders than bus riders in SF!

Financial crime is typically committed in offices in the financial districts of cities, not at the home addresses of the offenders. These districts are typically very well connected by public transport.

Who doesn't work from home on occasion?

Anyway, I wager that much insider trading is done from home rather than in the office. It is safer that way.

This is only SFPD data. I'm pretty sure various federal agencies investigate a lot of the white collar crime in the FiDi.

People being at places causes crime in those places, and public transportation helps people be at places.

Well, fine. If you don't adjust for population density, you are correct. The more people in an area, the more crime there will be in general.

It doesn't cause crime, but it can spread street criminals to areas they might not otherwise be able to access (conveniently enough to bother).

Perhaps it facilitates crime (petty theft, etc) especially considering the density of people in areas where public transportation is available.

Can the light rail trains even climb up the huge hills?

Interesting. I suspect elevation is highly correlated with land value, so perhaps that is the hidden factor here?

I'd like to see this applied to other hilly cities. Great work!

Can we see the exact same study, but with SEC indictments and tax fraud/evasion, etc. ?

I'll bet that "climbs" ...

Probably for the perpetrators, probably not for the companies involved (which are likely to have the same set of Delaware and/or Cayman mailing addresses) :-P

But sure, I would expect that white collar crime is unsurprisingly correlated.

Maybe not though - boiler room style phone banks of investor fraudsters are a stereotype for a reason, and those aren't necessarily the high rent end of town.

Are you counting the crime as happening at the place of business, or at the home of the perp? It'd be hard to quantify, as WHERE a white collar crime happens is not particularly material. You can cook the books at home or at work.

Data that reports per-person activity needs to control for population density in each section, otherwise it just becomes a population map.


Hmm... I don't know if equal-area tells you much. As you go higher, you also sometimes/often get much rougher terrain. Population density might be more useful?

Pete Warden would like you to use public APIs: http://petewarden.com/2013/09/09/why-you-should-stop-piratin... . Also, the association of low ground with poverty is old. Toni Morrison riffed on it in Sula: http://en.wikipedia.org/wiki/Sula_(novel)

Very good article! The Data Science Toolkit he pumps as an alternative to pirating Google's APIs looks awesome, and deserves a direct link: http://www.datasciencetoolkit.org

Throwaway makes a very valid point that crime per area isn't very useful. But even beyond that, the map is also confusing because for the lower elevations it shows both lower-area AND higher-area crime incidents. Then as it goes higher, it just shows the higher ones. In other words, it should really show an outer ring for the lowest elevation (since that's where the lower areas are) and then a smaller ring, etc. Only the highest areas would not have a hole in the center (most likely).

Think of elevation maps and how you would select out one range of elevation at a time - you'd end up with donut like rings. You wouldn't mark an area as 500 feet and show all the rings for 500 feet and up. Doing this makes it seem like there are far more crimes than actual for the lower levels.
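A rough sketch of the difference between "this elevation and up" and true donut-style bands. The incident elevations and band edges are hypothetical, and this is not how the linked project actually bins its data:

```python
# Hypothetical incident elevations in feet; invented for illustration.
incident_elevations = [10, 40, 80, 120, 180, 260, 340, 520]
band_edges = [0, 100, 200, 300, 400, 600]  # assumed band boundaries


def cumulative_counts(elevations, edges):
    """Incidents at or above each lower edge (overlapping regions)."""
    return {lo: sum(e >= lo for e in elevations) for lo in edges[:-1]}


def band_counts(elevations, edges):
    """Incidents inside each [lo, hi) band (disjoint rings)."""
    return {
        (lo, hi): sum(lo <= e < hi for e in elevations)
        for lo, hi in zip(edges, edges[1:])
    }


print("at or above:", cumulative_counts(incident_elevations, band_edges))
print("per band:   ", band_counts(incident_elevations, band_edges))
```

In the cumulative version every incident is counted again at each lower threshold, so the low levels look busier than they are. In the banded version each incident is counted exactly once.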

How about showing it overlaid with a topographical map?

Yes, that would be a nice visualization. http://cartodb.com/ is a pretty powerful tool for these kinds of maps.

TopOSM is Open Data (OpenStreetMap data) and is my go-to place for finding out where hills are. http://www.toposm.com/ (For looking at hills in cities, click the "+" on the right to get a menu where you can toggle streets ("Map Features") on and off.)

This is a fun exploration, but there are just so many plausible additional factors, from population density, to SFPD's selective enforcement, to many others that can be at least as significant if not more so than this one.

This is not meant to be an exhaustive causal analysis. You try to control for land mass, but don't really mention anything else that might indicate elevation to be a less significant factor and I think that does a great disservice to what otherwise is an interesting exercise.

I don't think the article is trying to claim that elevation magically has a causal impact on crime rates. It just shows that there is a correlation and claims nothing more. The point is that there IS some underlying root cause (or an extreme coincidence) and that is an interesting point in and of itself.

"A great disservice"? Lighten up, this was a hackathon project, not a doctoral thesis. And the code is there for anyone to build upon using other factors.

I agree. This is a good example of a small project that was worth getting out there. The code is posted, so if it's triggering some ideas, that's kind of the point - by all means, run with it!

I suppose the analysis would be better if it used Census Block Groups for the grid. That way it could better account for population density. Unfortunately, extracting CBG data is extremely painful. CBG data would also be slightly problematic in that it is residence-based, so a place like NYC's Times Square may show a high crime rate per "resident" while being extremely safe.

Does anyone know where you can get a simplified CBG data dump? All you would need are boundaries and population.

This is interesting, but on my browser, the map cuts out much of the southern section of SF. Is it included in the numbers? This area may present a particularly interesting section to study elevation changes.


On "your browser"? It is an animated gif, not a browsable map. All browsers render the animated gif the same way.

Though you could just possibly forgive him given that Google Maps-style controls were left in the image?

I guess street crimes are more frequent in areas where there is a higher concentration of people/shops/bars etc.

I can't say I see a definite correlation with altitude: areas that are more secluded from the main SF buzz like the Marina and the west coast show very little crime, and they are basically at sea level.

You want population and crime rate included in the analysis? Fine. http://geosprocket.blogspot.com/2013/09/crime-doesnt-insert-...

Very interesting. Other comments have listed possible factors like population density, public transportation (MUNI). I'd love to see the correlation with:

- the derivative of elevation (i.e. the slope)

- the street lighting
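For the slope idea, one possible sketch is a central-difference estimate on a regular elevation grid. The grid values and spacing are hypothetical, and the function name is my own:

```python
import math

# Hypothetical elevation samples in metres on a regular grid.
elevation = [
    [10, 12, 15],
    [11, 14, 18],
    [13, 17, 22],
]
cell_size = 10.0  # assumed spacing between samples, in metres


def slope_at(grid, r, c, spacing):
    """Approximate slope magnitude (rise over run) at an interior cell
    using central differences along the two grid axes."""
    dz_dx = (grid[r][c + 1] - grid[r][c - 1]) / (2 * spacing)
    dz_dy = (grid[r + 1][c] - grid[r - 1][c]) / (2 * spacing)
    return math.hypot(dz_dx, dz_dy)


print(f"slope at centre cell: {slope_at(elevation, 1, 1, cell_size):.3f}")
```

This only handles interior cells. Edge cells would need one-sided differences, and real elevation data would need its actual grid spacing.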

What about adjusting for the greater probability that a given area is residential the higher the elevation? Perhaps more crime occurs outside residential areas. Just speculating here.

skwirl has the correct answer here, but it is buried in a sub-comment:

"I don't think the article is trying to claim that elevation magically has a causal impact on crime rates. It just shows that there is a correlation and claims nothing more. The point is that there IS some underlying root cause (or an extreme coincidence) and that is an interesting point in and of itself."


Is it all JavaScript I'm looking at?

(I'm only ~halfway through the JS course on CodeCademy and don't know much else programming related).

Yes, this is all JavaScript.

Crime likes a way to escape. Going higher up might just give you fewer ways to get away.

The first thing on that page is a GIF. There is no explanation. It is not clear what the GIF is meant to be portraying. I even went on to read the first two paragraphs, which also didn't explain what the GIF was supposed to be illustrating.

After that I closed the page.

We were just noting the same about poverty in San Francisco.

Correlate with population density? Also, the time of day?

tl;dr Everyone in San Francisco, buy ladders.
