Really love the idea, and the passion behind it. Def could have legs.
Here are the pitfalls I see you falling into:
(1) seriously, what data are you collecting? “Everything” isn’t a great answer (who’s supposed to use ‘everything’, anyway? “Anyone”?). “Apples-to-apples police misconduct statistics” is a good one.
(2) it’s important to clarify (1) because you need to know who you’re serving, and why. Different activists need different data. “Have all data” sounds good until you need to decide how to allocate your resources.
(3) more deeply, data is the land of edge cases. Even just with police misconduct, you need to get DEEP to rigorously compare seemingly-simple stats like “# of unjustified police killings”. If you don’t start narrow, you’ll never show value. If you don’t show value, nobody will ever care you exist.
When I look at the data you’ve collected, it ranges from annual reports, to municipal contact info, to crime stats. What’s important to collect at scale? To whom? What do they need it for?
Again - great, ambitious idea! But $250k goes fast. Show value before it runs out!
Thanks for the thoughtful response! This is really helpful.
1. Agreed. Our strategy for this isn’t clear on the website, I guess, but we do have one. It’s to focus on depth in geographic areas. This is because context is critical, and because most of the users we talk to are operating locally with municipal or county level data. So it’s more important to have every data source we can possibly find relevant to Pittsburgh than it is to have every arrest record in every municipality. Or at least, it’s more immediately useful to people.
That said, most people seem to contribute data sources from where they live. I think little microcosms will spring up where people take stewardship of maintaining information about their chosen geo or subject areas. That’s not too far down the roadmap: it’s Milestone 2, for the PDAP heads.
2. I will take it as a next step to make this strategy clear and say why. We basically want to let the community make its own to-do list: what kind of question are you trying to answer? That creates a “bounty” for data which can be fulfilled by an altruistic volunteer, another member of your team, etc.
3. Yes. We’re not trying to do apples to apples comparisons of departments yet, partly because it’s so absurdly difficult and you don’t know where to start. Why would you undertake a 12 hour research project to compare St. Louis and Minneapolis incident reports if you don’t have a use case? Instead we’re focusing on what we DO know we need: complete local data, town by town / county by county.
The data we collected reflects the nature of our early experiments, which were scattered. This Airtable prototype is maybe 2 weeks old; next up is helping people understand where to focus.
The idea for demonstrating value is also local. I’m working with groups in Pittsburgh (where we are based, and where our funding came from) to make ourselves indispensable to them. I’m hoping to turn the $250k into a handful of killer local case studies this year, rather than marking 0.1% progress toward a national vision.
Thanks again for giving me the practice explaining this stuff. I hope I’m making any kind of sense, and of course happy to hear where I’m still wrong.
For number 1, I would look for scenarios where the officer was found to have committed misconduct or found to be unreliable. Then watch whether they're involved in subsequent cases or departments when they should probably never work as an officer again. Just my thoughts on one thing that could be done.
Uhh like how many murder cases got solved, how many drunk drivers had their driving licenses revoked, how many speeding tickets went through, how many stolen cars and goods found their original owners, how many pickpocketing, shoplifting, and burglary cases got investigated, you know ... Stuff police exist for?
Those stats are typically at the department level. They don't necessarily match the court records either (e.g. the police may mark a murder case as solved even if the prosecutor thinks the case is too weak and it never goes to court).
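To make that concrete, here's a rough sketch of the check, not anything PDAP actually runs. It assumes you already have two cleaned-up lists: misconduct findings and employment records, both keyed by a stable officer ID. The record types, field names, and the function `rehired_after_finding` are all made up for illustration, and in practice getting a reliable officer ID across departments is probably the hard part.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Finding:
    # Hypothetical record: a sustained misconduct or unreliability finding.
    officer_id: str
    department: str
    decided: date


@dataclass(frozen=True)
class Employment:
    # Hypothetical record: an officer starting work at a department.
    officer_id: str
    department: str
    start: date


def rehired_after_finding(findings, employments):
    """Return (officer_id, old_department, new_department) for every
    employment that starts after the officer's earliest finding and is
    at a different department than the one that made the finding."""
    earliest = {}
    for f in findings:
        known = earliest.get(f.officer_id)
        if known is None or f.decided < known.decided:
            earliest[f.officer_id] = f

    flagged = []
    for e in employments:
        f = earliest.get(e.officer_id)
        if f is not None and e.start > f.decided and e.department != f.department:
            flagged.append((e.officer_id, f.department, e.department))
    return flagged
```

Feed it whatever findings and hiring records you can get for one county, and the output is a short list of people to look at more closely. It flags moves between departments only; tracking involvement in later cases would need case records as well.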
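If you did have case-level data from both sides, the mismatch could be measured with something like the sketch below. It assumes police and court records share a case ID, which often they won't, and that the department uses a status value like "cleared". The function name and the data shapes are hypothetical.

```python
def cleared_but_never_filed(police_cases, court_case_ids):
    """police_cases: dict mapping case ID -> department status string
    (assumed values such as "cleared" or "open").
    court_case_ids: iterable of case IDs that were actually filed in court.
    Returns the sorted case IDs the department counts as cleared but
    that never show up in the court records."""
    cleared = {cid for cid, status in police_cases.items() if status == "cleared"}
    return sorted(cleared - set(court_case_ids))
```

The size of that list relative to the department's total cleared cases gives a rough sense of how far the department-level number drifts from what the courts saw.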