
How ZIP codes nearly masked the lead problem in Flint (2016) - dll
https://theconversation.com/how-zip-codes-nearly-masked-the-lead-problem-in-flint-65626
======
forkandwait
I work for a social services agency, and my guesstimate is that 70% of the
managers think that zip codes are the answer to all their spatial analysis
problems -- they are very easy to collect, and they _feel_ right. These
managers don't really have any training in geography, so it isn't exactly
surprising, but I think there is a pretty strong commitment to zip codes as
_the_ unit of analysis because it is the one thing they know and think they
understand.

For the folks who in this thread say "just use a point map of occurrences and
you would have seen the problem" you are right in this case, but it isn't
really enough in most cases.

For one, the boundary of the city limit needs to be shown to see the
correlation; perhaps one should map the point occurrences a la Snow, and then
overlay all the boundaries you have (mosquito district, city boundary, water
district boundary, county, etc...).

Additionally, a point map doesn't tell you rates of occurrence; for that you
need a denominator, or you can't calculate things like number of occurrences
divided by total population. One poster mentioned Census Tracts, and I would
agree those should be the default boundary, subject to more investigation --
tracts correspond well to "natural" boundaries, they are mostly consistent
over census decades, they are mostly consistent with population (~5k), they
nest perfectly with other census boundaries, and they are the smallest
geography for which you can get mostly the full suite of population
characteristics like education from the ACS.
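
To make the denominator point concrete, here is a minimal sketch of turning
case counts into rates per tract. The tract IDs, case list, and populations
are made up for illustration; in practice the populations would come from
Census/ACS tables.

```python
from collections import Counter

def rates_per_1000(case_tracts, population_by_tract):
    """Return cases per 1,000 residents for each tract with a known population."""
    counts = Counter(case_tracts)
    return {
        tract: 1000 * counts.get(tract, 0) / pop
        for tract, pop in population_by_tract.items()
        if pop > 0
    }

# Hypothetical data: one entry per case, labeled with the tract it falls in.
cases = ["T1", "T1", "T1", "T2", "T2", "T2", "T2", "T2", "T2"]
population = {"T1": 1000, "T2": 6000, "T3": 4000}

rates = rates_per_1000(cases, population)
# T2 has twice as many dots on a point map, but T1 has the higher rate.
```

With these invented numbers the tract with fewer points has three times the
rate, which is exactly what a bare point map would hide.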

For some reason tracts feel just wrong to people. I have managers who reject
zip codes, but then fixate on weird systems like pixels or equal area hexagons
or whatever for which you can't get denominator data and which don't
correspond to anything real. I don't understand, though I don't think in
public policy / social work education anybody bothers to teach you shit about
GIS or Census data unless you know to ask specifically.

------
toast0
At a more general level, your mailing address indicates how to deliver your
mail; it does not indicate what city, if any, the delivery address is in.
Neither does your zip code (as illustrated in the article), your phone number
area code, or phone number prefix. All of these things indicate the location
of the mail or phone facility that services the delivery address -- or that
once did, before facility consolidation or expansion.

A related phenomenon is that different agencies have different boundaries,
despite often sharing the same geographic name. Cities, water districts,
sanitary districts, school districts, community college districts, counties,
congressional districts, etc each have their own boundaries, and they are
usually not aligned.

~~~
TallGuyShort
I encountered this the other day: friend of mine has a zip code associated
with a major city, and puts the name of that city in their mailing address.
But the land is not incorporated in any city. City government was trying to
enforce city laws against them and use the mailing address as justification.
Stupid.

~~~
niftich
Paradise, Nevada -- an unincorporated town [1] under Nevada law in Clark
County, which hosts all of the famous casinos of the Strip between Sahara and
Mandalay Bay, as well as over 200000 people -- is definitely a real place, yet
all domestic mailing addresses that are in Paradise, in ZIP codes like 89109,
89119, 89169 need to say 'LAS VEGAS'.

In Virginia, Henrico County, which surrounds the City of Richmond on three
sides, petitioned the US Postal Service in 2008 to allow 'HENRICO' as an
acceptable label for affected ZIP codes that were used on its territory.
Establishing a shared identity was a factor, but so was the fact that in some
cases collected taxes were misappropriated to the wrong jurisdiction on
account of a label lookup from the mailing address [2].

[1]
[https://en.wikipedia.org/wiki/Unincorporated_towns_in_Nevada](https://en.wikipedia.org/wiki/Unincorporated_towns_in_Nevada)
[2]
[http://www.henricocitizen.com/index.php/news/article/address...](http://www.henricocitizen.com/index.php/news/article/address_change_provided_revenue_identity_3128)

~~~
_acme
Isn't the primary idea of an unincorporated town that it doesn't 'exist' for
purposes such as mail?

~~~
TallGuyShort
All the unincorporated places I'm familiar with in the mainland US are under
the jurisdiction of federal, state and county governments, but not a city or
town government. I'm sure there are examples of more sovereign towns... US
Postal Service is federal and required to service the whole country. Lots of
cities and towns manage utilities like power, sanitation, etc. and would
probably exclude unincorporated properties from access. People just buy them
from county governments, private companies, etc.

------
sitkack
John Snow [1], who traced the cholera outbreak to an unsanitary water pump,
plotted the occurrences directly on a map. The other map in the article shows
a similar flaw to that of using zip codes.

Lesson: if you want to correlate something spatially, use spatial coordinates.

[1]
[https://en.wikipedia.org/wiki/John_Snow](https://en.wikipedia.org/wiki/John_Snow)

~~~
cwmma
It's not really so simple; binning is a really useful technique to display
data, especially large amounts of messy data. Just putting all the points on
the map only works if you have a low enough number of incidents in a compact
enough area that the individual points make sense. But if you are not careful
your points might actually be a population density map and be masking higher
relative occurrences of something in certain areas.

The key thing is that if your bins (zip codes) don't align well with the
underlying data (municipalities) you can mask patterns. In this case they
basically (to borrow a phrase from gerrymandering) cracked the high lead
district across a bunch of zip codes.
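
A toy illustration of that cracking effect, using invented numbers (these
are not the Flint figures): a city with a 10% elevated rate is split across
two zip codes that each also take in a larger, low-rate area outside it.

```python
# Each record: (zip_code, municipality, children_tested, children_elevated).
# All values are hypothetical and chosen only to show the dilution.
records = [
    ("Z1", "city", 500, 50),
    ("Z1", "outside", 1500, 30),
    ("Z2", "city", 500, 50),
    ("Z2", "outside", 1500, 30),
]

def rate_by(records, key_index):
    """Aggregate tested/elevated counts by the chosen column and return rates."""
    tested, elevated = {}, {}
    for rec in records:
        key = rec[key_index]
        tested[key] = tested.get(key, 0) + rec[2]
        elevated[key] = elevated.get(key, 0) + rec[3]
    return {key: elevated[key] / tested[key] for key in tested}

by_zip = rate_by(records, 0)           # bins that cut across the city
by_municipality = rate_by(records, 1)  # bins aligned with the water system
```

Binned by zip code, both bins come out at 4%; binned by municipality, the
same records give 10% inside the city and 2% outside.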

This is not actually an issue with zip codes nationwide; in MA there is only
one zip code that crosses municipal boundaries, so if you were doing the
analysis in Massachusetts you wouldn't have had this issue.

~~~
will_pseudonym
You can easily bin spatial coordinates, too.
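
For example, a rough sketch of binning points into square grid cells,
assuming the coordinates are already in a planar (projected) system and
that `cell_size` is in the same units. The sample points are made up.

```python
import math
from collections import Counter

def grid_bin(points, cell_size):
    """Count points per square grid cell; cells are keyed by (column, row)."""
    return Counter(
        (math.floor(x / cell_size), math.floor(y / cell_size))
        for x, y in points
    )

# Hypothetical projected coordinates.
points = [(0.2, 0.3), (0.8, 0.9), (1.5, 0.1), (-0.4, 2.2)]
cells = grid_bin(points, cell_size=1.0)
```

This only gives counts per cell; turning them into rates would still need a
population figure for each cell.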

~~~
praseodym
Sure, but it still requires some more skill than say, creating an Excel
PivotTable and drilling down on the ZIP column.

------
empath75
A lot of people put far too much faith in algorithms, and assume that because
it's all math, it's both correct and unbiased.

There are so many, many ways to lie with numbers, and once money and power
depend on the outcome of the algorithms, you can never trust that people won't
design them to produce the outcomes they want. I suspect the problem here was
less 'didn't understand geography' and more 'didn't give a shit about the
truth'

------
brightball
Has nobody in the conversation around the Flint crisis paid attention to
Detroit Water's part in the issue? They terminated their contract when Flint
decided it had long term plans to move and that termination forced the issue
of finding an alternative water supply until the long term solution was ready.

If not for that, there never would have been a problem, Flint would have
eventually had its long term new supply without ever needing an intermediate
option.

When I sat down and finally looked over the details of this story, it's the one
glaring omission that I noticed from all of the news coverage.

~~~
rhino369
As I understand it, the direct cause of the crisis wasn't that they changed
water. It was that they changed water and were totally incompetent in doing it. The
basic pH levels in the new water are easily compensated for. Flint just
didn't.

It's like if an uber driver canceled your pick up and the next one you got was
drunk driving. Sure that first driver was a but-for cause, but not a direct
cause. They have no moral culpability for the reckless actions of the next
person.

edit: edited to correct the pH level as suggested by max.

~~~
maxerickson
The river water has too low a pH; the correction is to increase it.

~~~
ceejayoz
Yep. And it wouldn't have cost much. Emphasis mine:

> Marc Edwards, a professor at Virginia Tech who has been testing Flint water,
> says treatment could have corrected much of the problem early on — _for as
> little as $100 a day_ — but officials in the city of 100,000 people didn't
> take action.

> "There is no question that if the city had followed the minimum requirements
> under federal law that none of this would have happened," said Edwards, who
> obtained the Muchmore email through a Michigan Freedom of Information Act
> request.

[http://www.nbcnews.com/health/health-news/internal-email-mic...](http://www.nbcnews.com/health/health-news/internal-email-michigan-blowing-flint-over-lead-water-n491481)

~~~
irixusr
Interesting. Although I tend to attribute things like this to ignorance and
not malice. < Helps me pretend people aren't so bad.

The people who make decisions often don't know what they're deciding about
(how can the same person make a good decision about child education, policing
standards and public hygiene?) and I think our modern philosophy that a "bad
decision made early is better than no decision" amplifies this.

There are some decisions in life that have to be made carefully so we don't ask
for forgiveness.

~~~
maxerickson
The importance of pipe scale and water chemistry is something that the person
making the decision absolutely should have known. It's a huge systemic issue
if the decision has been taken away from someone with that basic technical
knowledge and handed to some administrator.

It really is pretty basic knowledge for someone making decisions about a water
system.

~~~
spacemanmatt
...which is why the trend of conservative-run states usurping power from
cities is so disturbing to me

------
sarcher
Great summation of the issue! ZIP code related issues are a constant problem.
Often the administrators of data sets don't know how to geocode a full
address, and there are few easy ways to have participants record 'good'
geographies. Can you imagine asking people whether they know what Census Block
they live in? So we end up with a lot of data sets where ZIP codes become bad
short-hand for geography.

Last year I worked on a study where the geographic data had been fuzzed prior
to passing it along to the study team, a process that unfortunately resulted
in the patient addresses being simplified to just the ZIP code provided by the
patient.

There were two major issues beyond the general unsuitability of ZIP codes as
geography: first, ZIP codes don't seem to be verified during the course of
patient care. They are recorded (incorrectly at a minimum rate of ~0.2%) as a
part of a patient address during the creation of the initial record. Even if
the address is used for mailings, a bad ZIP does not result in mis-delivered
mail in the vast majority of cases due to the availability of city, state, and
street address data in the full address. The issue is just never discovered
until you are put into the position of relying on only the ZIP code for
geography, at which point the mitigating data (the full address) is
unavailable.

Second, even when ZIP codes do match to Census ZCTAs (the closest thing to
'ZIP code geographies' available and commonly utilized) there are some ZIP
codes that do not have exactly matching ZCTAs. For example, 03911 and 04565
are real ZIP codes without (exact) matching ZCTAs.

It would be preferable if address data, before being obfuscated for data
anonymity reasons, was properly geocoded. After geocoding, the resulting
geography could then be fuzzed for anonymity. Of course, this would just lead
to arguments about what constitutes proper diligence in geocoding :)
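
As a rough sketch of the "geocode first, fuzz second" order, assuming the
geocoder has already produced projected coordinates in metres: each point
is moved by a random offset within a fixed radius. The radius and the
sample point are hypothetical, and random jitter on its own is not a
guarantee of anonymity.

```python
import math
import random

def jitter(x, y, max_radius, rng):
    """Move a projected point (metres) by a random offset within max_radius."""
    angle = rng.uniform(0, 2 * math.pi)
    # The square root spreads offsets evenly over the disc instead of
    # clustering them near the original location.
    distance = max_radius * math.sqrt(rng.random())
    return x + distance * math.cos(angle), y + distance * math.sin(angle)

rng = random.Random(0)  # seeded only so the example is repeatable
fuzzed_x, fuzzed_y = jitter(1000.0, 2000.0, max_radius=250.0, rng=rng)
```

The fuzzed point stays within 250 m of the true location, so it can still be
assigned to a tract or other boundary with a known error, which a
patient-supplied ZIP code does not allow.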

------
tyingq
This is an interesting read, but just seems like an obvious problem with no
obvious solution other than "use the right boundaries for your problem space".

He needed geographic search boundaries that matched where the suspect water
was being delivered. It so happens in Flint's case that it matches city
boundaries. I suspect that's not the case everywhere; that is, water delivery
in some places might extend beyond city boundaries. Or perhaps some cities are
split, and have more than one water source.

So the overall lesson here doesn't have much to do with zip codes, it has to
do with searching the right boundaries. City boundaries wouldn't have been
appropriate to search for mosquito borne issues. So that's not a good default
either.

~~~
maxerickson
The solution in the article was a simple scatterplot of the data points.

There's no need to use any boundaries when just examining the data. The
basically unbelievable thing that happened is that they only looked at one
summary of the data.

~~~
tyingq
I agree with your last point. A scatter plot, though, could just reflect
population density. There is still value in bounds so you can see relative
figures. I don't see a silver bullet really. The difference here seemed to be
having someone with the right skills.

~~~
maxerickson
Sure, it wouldn't be the end of the analysis or necessarily the thing you'd
use to communicate the data. But here's a paragraph from the article about the
distribution of the data:

 _As I ran the addresses through a precise parcel-level geocoding process and
visually inspected individual blood lead levels, I was immediately struck by
the disparity in the spatial pattern. It was obvious Flint children had become
far more likely than out-county children to experience elevated blood lead
when compared to two years prior._

"It was obvious"...

------
amichal
Frustratingly, the default guidance from HHS on de-identification of
PHI uses the first 3 digits of ZIP codes (ZCTAs actually) in their examples
[1]. We've had fun conversations on publishing analyses that rolled up to
other more useful spatial binning because of this.

[1] [https://www.hhs.gov/hipaa/for-professionals/privacy/special-...](https://www.hhs.gov/hipaa/for-professionals/privacy/special-topics/de-identification/#zip)
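
For reference, a minimal sketch of the 3-digit generalization described in
that guidance: keep the first three digits only when the combined area for
that prefix has more than 20,000 people, otherwise use 000. The function
name and the population lookup are hypothetical; real figures would come
from Census data.

```python
def generalize_zip(zip_code, zip3_population, threshold=20000):
    """Reduce a 5-digit ZIP to its 3-digit prefix, or "000" if too small."""
    zip3 = zip_code[:3]
    if zip3_population.get(zip3, 0) > threshold:
        return zip3
    return "000"

# Made-up prefixes and populations, for illustration only.
zip3_population = {"123": 400000, "999": 15000}

kept = generalize_zip("12345", zip3_population)
suppressed = generalize_zip("99901", zip3_population)
```

Either way the output is coarser than a single ZIP code, which is why it is
awkward for any analysis that needs tracts or municipal boundaries.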

------
numo16
The way the greater Flint area's zip codes and address system works out had
some other interesting impacts, as well, from a business perspective. After
the crisis became more public, we started seeing businesses who were in Flint
(greater Flint area, Flint mailing address), but not the City of Flint (the
affected area), putting up signs and trying to make very obvious that they
didn't have Flint water.

The City of Flint's water system is isolated from the rest of the county,
which falls under the Genesee County Drain Commission's jurisdiction and is
connected to the Detroit water system. It is still difficult to get some
people to understand that they are different things (have had to explain to
family many times that while we live in "Flint", we are not on Flint water).
It would be interesting to see a study done on the effect this had on
businesses both within the city and in the greater Flint area.

------
kensey
"ZIP codes were arbitrarily delineated and covered a range of neighborhood
types; in most small and midsized cities, one ZIP code can cover urban,
suburban and rural neighborhoods with highly variable socioeconomic
characteristics."

My own ZIP code is a perfect example of this. We're in the ZIP code of Town A,
but Town B is actually the closest incorporated town to us. The Town A ZIP
code extends between the state line and a river, up the eastern edge of our
county. That covers our large development, which has people from very poor to
very wealthy, four smallish unincorporated communities that are mostly poor-
to-median, an incorporated town of about 1,000 bordering Town A, and the
actual Town A which is a National Historical Park and consequently is mostly
shops and museums catering to tourists and actually has very few residents
(about 300 out of about 13000 that live in the ZIP code).

------
dsfyu404ed
tl;dr There isn't a good mapping between zip-code and water source so
water source being the common denominator for high lead levels wasn't
recognized, little additional effort was put into combing through the data and
you know the rest of the story.

~~~
foobarchu
The point isn't really about water boundaries, it's a larger point about boundaries
in general. Zip code isn't a good mapping to anything except how your mail is
delivered. The water system was just a very good example of how using zip to
do these mappings can fail catastrophically. This is a piece about GIS, not
the particular data involved in telling it.

------
haskal
> guesstimate

Stop making up words for which there already are words. "estimate" works just
fine in that context.

~~~
dang
Please don't be rude in comments here, or take threads on nitpicky tangents.

We detached this subthread from
[https://news.ycombinator.com/item?id=14240636](https://news.ycombinator.com/item?id=14240636)
and marked it off-topic.

