This, to me, is so neat.
His hobbies make me feel bad about my own...
a cautionary tale from a public servant
Nothing this person is doing is magic, and to the extent that genetics or extensive background prep play a role you could still overcome that deficit enough to at least participate in the space, even if you weren't the most amazing, talented person in the world. It's not like making the world better by addressing weird problems is a particularly competitive field.
I'll make a claim that literally every single skill being shown here is something that you personally could learn, and it probably wouldn't take you more than a couple of months to a year of regularly exercising your creative muscles to get to the point where you were doing something useful.
I would not get that same enjoyment from doing what this person does, although I fully agree I could probably replicate his method. My hobbies reflect what I enjoy doing, as do those of most people. I can feel nominally "bad" that mine are so incredibly worthless comparatively but still know that I wouldn't get as much utility from doing the "better" thing?
If you know that overall the good parts of an activity wouldn't outweigh the bad parts, but you honestly believe that you have a duty to do it anyway, then you should follow your conscience and ignore trying to have a positive life. I don't think that's healthy, but some people disagree with me.
If on the other hand, you know that overall the good parts wouldn't outweigh the bad parts, and you don't think you have a duty to make yourself miserable in your free time, then what the heck do you feel bad about? If your guilt is real, you should acknowledge and take steps to address the cause. If your guilt is not real, then get a rubber band or something and snap it against your wrist every time you feel guilty about comparing yourself to other people.
You can replicate this if you think it would be worthwhile to do so. If you don't think it would be worthwhile to do so, then you shouldn't feel bad about it and it's just your stupid tribal lizard brain that's making you feel that way. And if you do think it would be worthwhile, then do it. I mean, even video games have bad parts. Having bad parts doesn't make a hobby unsatisfying or unenjoyable by default.
The third category in all of this is someone who thinks that overall the good parts of this kind of work would be incredibly satisfying, but they're so scared of the bad parts that they can't start engaging, or they're so scared of being inadequate that they never try to learn or develop their skills. My comment was addressed to that person. If you're not in that category then, I dunno. Flip your "feel better" switch.
I think I could do this, from my starting point it wouldn't take me very long to do something useful. But it looks like the meat of this project involved corresponding with a city bureaucracy and cleaning up ugly data to get basic info from it. It sounds awful.
So... if a hobby is just something optional you choose to do and don't make a living at, then yes, I could do this as a hobby. But in the sense where hobbies are things you do for recreation - reading, gaming, woodworking, gardening, etc? This would be far less pleasant than any other hobby I have, and the bureaucracy parts would be actively negative.
I'm glad this happened, and I'm glad if people enjoy doing this sort of thing! But I think when we talk about having productive hobbies, it's worth differentiating "tasks you can achieve" from "tasks you can seek out sustainably without hurting your quality of life". I have productive things in both categories, but I'd only describe the second category as hobbies.
No amount of practice is going to teach me how to see that these problems exist in the first place.
No amount of practice is going to teach me that government types might actually be willing to get rid of their revenue generators.
It takes a special type of person who sees what the actual problem is to begin with, then figures out how to get the data into a format where it all actually makes sense, and then takes the next step of contacting the appropriate authorities to get the problem fixed.
Don't get me wrong, I do understand and appreciate that some people might be able to repeat the same type of success, if only they had more training and practice. But you have to be careful about how you "encourage" that kind of thing.
Any reasonable person would make such an assumption without a second thought.
I'm still baffled that the city took any action whatsoever in response to this guy's request. How could he have known that they would just roll over and cough up $60k in lost revenue? It's seriously out-of-character for them.
Governments are of the people so long as the people maintain them as such.
When you treat a government like a business, it falls behind on its maintenance schedule, and begins to resemble one. There are a lot of people who wish the government was their business, and will encourage you to play along. These people work night and day to lower your expectations of what you’re capable of.
Stop falling for it!
The USA had some ingenious (and flawed) founders who set in place some rights and traditions that reserve at least a small finger hold which resembles democracy, by which the people can mobilize effectively.
But, in any nation, that fingerhold can exist when enough people come together en masse.
> Governments are of the people so long as the people maintain them as such.
"We the people" decided that government has some set of tasks to perform. Those tasks require resources, so "we the people" decided that we should steal some portion of each other's earnings, in order to provide a means that these tasks be accomplished.
The continued legitimacy of this ... arrangement ... is due to Social Contract Theory.
If, for discussion's sake, we temporarily step into the average-person's shoes and accept the Social Contract Theory at face-value, then it is most simple to conclude that what we are really doing is paying the government a lump sum in exchange for some set of services, and by garrulously quarrelling and advertising to each other we can decide who pays how much, and what is the set of services, and to a limited extent how the services ought be provided; under the limitation that if all the lump sums can't pay for all the services, then the difference will have to be made-up by printing more money and thereby reducing everyone's purchasing power (in short: further theft-from-all).
From that we can conclude that the "best" government would be the one which satisfied the majority of the desires of the majority of the people while appearing to take, in return, as little as possible. Does that not sound like the goal of a business?
> The USA had some ingenious (and flawed) founders who set in place some rights and traditions that reserve at least a small finger hold which resembles democracy, by which the people can mobilize effectively.
Problem is, that kind of mobilization will cause enough chaos that it's not worth doing it over just a parking-ticket racket.
No, it will likely require some single, unambiguous, flagrant, overt, unapologetic and high-stakes treason against the letter of our principles and procedures ... before the 2nd Amendment's most fundamental purpose is put-to-action.
And I have no faith that the outcome of such an event, will be nontrivially better than the Articles of Confederation; it probably won't even outshine the Constitution of 1788.
But this whole discussion is going far afield enough that my betting-money says we'll soon hear from Fearless Leader Ang...
It's not difficult, just costly.
Sadly public salaries and jobs in general are low hanging fruit targeted by short-sighted political campaigns while all other gov spending tends to be a black hole with zero measurable ROI and countless professional public grant/gov money consumers well aware of the lack of measurable ROI and crony/who-you-know-in-gov nature of spending. Or worse the phony claims of adopting 'private industry' to side step accountability via public-private arrangements which feature none of the benefits of markets (true competition, state anointed monopolies, market dominance disconnected from value provided to consumers, etc).
Most of which could be blanketly solved by hiring good people (the people who dispense and use the money) and not creating quid-pro-quo incentive systems by underpaying public servants.
1. Pay for all the below by strategically cutting military spending some low-to-mid single-digit percentage.
2. Eliminate public sector unions and most Civil Service classifications, both of which make it harder to fire low performers.
3. Peg government payroll for all positions to (if it exists) their private-sector counterpart, plus a healthy percentage, say 15%.
4. Eliminate government pensions and match government employee 401(k) contributions up to a large percentage, e.g. 10% of salary.
5. Allow all government employees to enroll in Medicare.
 The military is already getting projects it doesn't want as pork, most of these cuts could be to those programs.
 There are legal caps to salary for the executive branch but there are already ways around those (e.g. as contractors).
I do agree that government jobs should be more competitive in the job market, just with caution.
You've identified the problem.
They really wanted me in that job, too. But then the state froze hiring and several months later the posting expired. It hasn't been re-listed since then and it's been nearly a year now.
I've made an effort to work for local governments. I've even made an effort to volunteer for local governments, and do this sort of technology work directly for them instead of as a FOIA-enabled personal project.
I've never seen a flicker of interest, or found a job posting that would leave any room for this sort of work. I've only ever seen indifferent hostility to the volunteering offers, and while I understand why that could be worse - bureaucratically - than paying staff, it's still not exactly systems making an effort to serve the people.
The US Digital Service was a brilliant and wonderful project to get technologists doing exactly this. It hired a lineup of top-notch staff and got a lot of great stuff done. And, yes, it paid government salaries and appealed to civic duty to recruit. It sounds like a wonderful place to work. Outside of that one national-level pet project from Obama? I mean, I got involved with a local technology/privacy group. They're currently considering suing the town for not following its own surveillance-restricting ordinances about street-facing cameras, because the town found implementing them too hard - and isn't hiring anyone who can, and wouldn't accept volunteer labor to do the work.
It's not just that advertising pays better, it's that advertising doesn't actively avoid working with people for this sort of task.
It's "not speeding" /savedyouaclick
No, because they purposely fine you when you are a few kilometers above an arbitrary limit (let's say 55 km/h instead of 50 km/h) which is not speeding in any way and there is no data supporting any kind of increase of accidents at such levels of speeds.
On top of that they use all the dirty tricks in the book (mobile radars, radars right at the exit of a tunnel) which act like traps for anyone who is not constantly vigilant about their current speed. Let's face it, nobody is spending 100% of their attention on the speedometer while driving.
And when "normal" people around you get fined while you know for a fact they are not driving like crazy folks on the road, something is really, really wrong.
In the blog post he estimated it's saved $60,000 in fines. That's a drop in the bucket at the scale we're talking about (a quick Google says the proposed 2018 budget for Chicago was $10.1 billion), but still a decrease. Not only would it cost the government money to employ someone to go through this data and find hotspots like this, someone to go out and evaluate the signs, presumably several someones during the approval process to change signage, and finally people to make and install that signage, but the end result would be to purely cost them more money by decreasing revenue from fines.
I don't personally think governments are inherently evil (though some do seem to try harder than others), but even from a purely capitalistic viewpoint that's a hard sell for anyone who cares about their budget. At the very very best I could see it becoming a token effort that's mostly marketing ("look, we're using big data to make your life better!").
My suggestion: Remove the incentive by divorcing all fine (and similar things like seized goods) revenue from the government budget. Perhaps stipulate that it gets distributed to charities, or is split equally among taxpayers as a tax offset.
No, the government should be "THE people", not "for" or "by". We introduce many issues with representation of larger groups by very few individuals, that is precisely why democracy/government works best at a local scale rather than State or Nation-wide. It's not very hard to grasp why.
I’m sorry for the confusion, I thought that was blatantly self-obvious.
And we can actually estimate how much the government actually saves if it does not need to fine (that much). Let‘s first look at the costs of the status quo:
- it costs X to check rules are followed (here: no cars park where they must not); this is mostly personnel costs\* but note that we may be talking about „manhours spent that could have been spent doing more sensitive/productive things“
- it costs Y to maintain the infrastructure to process and follow up on the fines (here: the $190 million IBM contract\* )
- it costs Z to collect the fines, process the payments (personnel costs), follow up on those that do not pay, court fees (process & personnel costs), jail costs (cause it‘s ’murica), and whatnot.
Now, if there are no\* fines to prosecute, here‘s a few ways the state can gain money:
- Citizens spend less time and money with unproductive work (here: paying fines), leaving more time for work (or relaxation which again increases productivity) and money to spend (raising economic output + sales tax).
- Officers, bureaucrats, and judges can spend their time dealing with more important work.
- E.g. in the case of parking fines, businesses affected by cars parking in their way can instead do their business unhampered, i.e. be productive and this increases, again, economic output.
All of that increases tax returns (or reduces tax money spent for dealing with the fines).
\* Obviously, there‘s no way fines will drop to zero, because humans. But minimizing fines makes it possible to minimize capital and infrastructure costs and increases economic output, thus there‘s a net gain that can offset the costs to reach that goal, if not immediately then within a few years.
Governments absolutely should be looking to make profits, it's just that they don't necessarily have to make them in dollars.
So say you were able to objectively measure the value a program produced (costs are usually already known). If a program costs 10 units for every unit of value it produces, maybe it isn't a good program. If it produces modestly more value than it costs, it's making society a profit.
I understand what you're saying, but I think you're stretching the common understanding of "profits" and risking confusion because of it.
Absolutely, government should try to measure the impact of its actions, but expressing that in terms like profit can lead to undesirable consequences like the expectation that a successful self-promoter pretending to be a successful businessman can also be a successful president.
In this case, the dogmatic assumption that every endeavour should be financially profitable.
I was in no way implying the government was trying to make money off increased fines (quite the opposite with the last paragraph), simply that it would very likely end up costing them more than it saved the taxpayers to support such an initiative at a larger scale, and that would have a very nebulous gain.
Can you imagine being the Mayor of your department at work and proposing to your board of directors a multi-million dollar budget for next year that includes a huge carve out for evaluating all the petty fines and late fees you collected from customers because it’ll make them happier?
Goodwill is one thing, but who is ever going to approve that?
You've stumbled upon another capitalist dogma: that if you can't measure it (like, for instance in this case, the happiness of your road users), it has no value. I would argue that his numbers show that his efforts have prevented 600 fits of rage among the citizenry. That's gotta be worth something, right?
> Can you imagine [spending money] because it’ll make [people] happier?
Again, government is not a business and should not be run as one. So yes, I can.
My city has a thing called a target area. The idea being to spiff a region up, and rotate them to keep the city vital and livable overall.
They send a facilitator, who gathers interested people. Some projects get identified and everyone does their part. The people in need of city resources or people get an introduction and help navigating things.
Everyone else plays a role. Labor, outside (not city budget) fundraising, planning, feed the people, organizing, whatever.
My group used its few years well. Traffic flow changes, a small park made from abandoned property, refurbish the school play areas and equipment.
It took a bit of time and some sweat, but not too much.
And I can drive through today and see that net good.
Some of the people I worked with did exactly what you just said, and for basic, make it better, reasons.
Looking into this more, it sounds like, while there probably is a perverse incentive because more tickets equals a significant increase in revenue, the biggest problem is the punitive fines for minor non-traffic offenses that tend to compound for poorer residents: https://www.motherjones.com/crime-justice/2018/02/how-does-c.... Note the headline is a little misleading, they seem to mean "non-moving violations".
Democracy requires constant vigilance in many forms. This is one such form.
Matt, we salute you and your efforts! I hope this encourages others to get involved in improving their local government (and perhaps even creates reusable tooling for use at scale [“citizen oversight as code”]).
They could be. And having a beer or two with govt people will reveal these kinds of desires and ideas.
I have had those chats in the past. Got involved in a legislative effort and was given a sort of insiders view, tour.
What gets in the way, the number one thing, is money. Not lack of it so much as priorities and ripple effects.
Fixing the signs is a net public good. The ripple effect might be revenue targets going down, and when the priority is the revenue made from inane parking tickets, it all makes for a bit of a mess.
The number two is people forgetting or ignoring who works for who and why. There are lots of little fiefdoms, all closely guarded. Barriers where there really should be collaboration.
And on that note, collaboration can be expensive. Sure, we can step out of your way on this, but about that school levy...
Remember, you're part of the government.
This is a really important idea, and I don't think I've seen it expressed so clearly before. Thank you.
Interesting note about getting data like this - Illinois FOIA allows a requester to submit a SQL query as part of their request, so long as they know the tables and columns within the database ;)
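To make that concrete, here is a rough sketch of the kind of query such a request might carry, run against an in-memory SQLite database from Python. The `tickets` table, its columns, and the sample rows are all hypothetical; a real request would have to use the agency's actual table and column names, which is exactly why knowing them matters.

```python
import sqlite3

# Hypothetical schema and rows: the real names would have to come from
# the agency's own data dictionary.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tickets (issue_date TEXT, violation_code TEXT, address TEXT)"
)
conn.executemany(
    "INSERT INTO tickets VALUES (?, ?, ?)",
    [
        ("2017-03-01", "CODE_A", "100 EXAMPLE ST"),
        ("2017-03-02", "CODE_A", "100 EXAMPLE ST"),
        ("2017-03-02", "CODE_A", "200 SAMPLE AVE"),
        ("2017-03-03", "CODE_B", "100 EXAMPLE ST"),
    ],
)

# The text a requester might paste into the request: ticket counts per
# address for one violation code, busiest addresses first.
FOIA_QUERY = """
    SELECT address, COUNT(*) AS ticket_count
    FROM tickets
    WHERE violation_code = ?
    GROUP BY address
    ORDER BY ticket_count DESC
"""

rows = conn.execute(FOIA_QUERY, ("CODE_A",)).fetchall()
for address, ticket_count in rows:
    print(address, ticket_count)
```

An aggregate query like this also sidesteps some of the privacy objections, since it never asks for plates or names.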
Not only did you save drivers $$$thousands, but you freed up at least one of 'Chicago's Finest' to fight more serious crime.
The ones that didn't send me a rejection letter all told me that the cost would be somewhere between $400 and $2,000 to complete the work. While it's nice, and I've spent similar on requests before, it's not something I'm exactly interested in doing repeatedly, since this is all coming from my personal savings.
That said... I'm still fighting that fight in Chicago. Chicago used to post a "data dictionary" of 111 of their databases, which had a list of the tables and columns. Problem is... it isn't running anymore since it's no longer being funded by a $300k (!) grant. So, I'm working on requesting a database dump of that. Last request for it was rejected saying that Chicago's Department of Innovation and Technology (DoIT) doesn't have any of those records. I'm thinking it's because I requested a "copy" of the database dump, rather than just explicitly asking for the data within it.
That opens up a pretty nifty door to work with for future requests ;).
Mind you the company responsible probably said "the database schema will cost $2k to put in a PDF".
Of course the former is pretty worthless, especially if the state doesn't outline any formal appeals process - you'd have to actually sue them for the records, which is rarely going to end up being cheaper. It also doesn't mean that the actual cost couldn't end up being insane. If no one has access to perform that particular query or knows how to get that data it could involve requesting a custom report from whomever developed and/or maintains the system. No one is reasonably going to pay IBM's hourly rate to write a report for them...
I work for a company that has a lot of state consultants (>50% of the workforce) and our salaries are typically 2-2.5x our state counterparts, and most items of work that would take more than an hour or two will require at least one level of management approval on both sides.
The other's through Muckrock, for specific requests:
https://www.muckrock.com/project/tables-and-columns-from-gov... (I'll add more requests to this list tomorrow)
Perhaps we need a meta-site with best practices and how best to attack open records requests in general.
(Not all of my post came through)
Great example of value being in data. One that ordinary people can connect the dots on and encourage.
That's some small-government activism I can get behind!
Under Florida public record law, source code produced by state employees is, in very narrow circumstances, a non-exempt public record (the code can't process sensitive data, etc.). I'm considering a future endeavor where I periodically request the code to such projects until the I.T. department decides it's worth the effort to open source it.
I like to think this is a step towards consolidating publicly funded code and reducing duplicate effort. Ahh, imagine making a pull request to your city's website! But I'm getting ahead of myself...
I have a lot of experience in making public records requests and would be happy to help.
I thought so too until recently and was honestly kind of surprised they actually gave it to me. They rejected giving license plate info at first, but they've given it out in other, similar, FOIA requests.
Specifically in FOIA's statute, it says:
(c-5) "Private information" means unique identifiers, including a person's social security number, driver's license number, employee identification number, biometric identifiers, personal financial information, passwords or other access codes, medical records, home or personal telephone numbers, and personal email addresses. Private information also includes home address and personal license plates, *except as otherwise provided by law or when compiled without possibility of attribution to any person.*
In some cases that is a very good thing. In other cases it’s just them trying to obfuscate and block transparency.
(random example: https://www.syracuse.com/news/index.ssf/2015/01/private_comp... )
Think of a license plate as an 'address' for a car.
[Edit: on the other hand, if the ticket is unfair (eg. confusing signage as in this example), then you have a valid point; I just wanted to point out the other side of the coin]
Also, this is completely missing the point here:
> Don't want your name tarnished? Don't park illegally.
It's not about reputation, it's about privacy -- and safety.
"Criminal" means in the criminal code, but both are illegal. I think that you don't have a right to privacy for either because it obfuscates the application of the law. Indeed, the Japanese government can query the Ontario government to get a list of transgressions that you had while driving a car in Ontario (I know this because they did so when I converted my driver's license to a Japanese one -- and they didn't need my consent).
I think OP's use of the term "criminal" is a bit loose, but I would be surprised if you have any right to privacy for a fine levied due to a legal infraction. Whether or not you should have a right to privacy is a completely different conversation...
Aside: It was important to me because many years ago I inadvertently drove while suspended. I had an unpaid ticket that I had forgotten about and my license was suspended. The suspension got lost in the mail (first a postal strike and then the delivery person put my mail in the wrong "super box" -- I eventually got it months later). When I was first getting my visa for Japan, I needed to find out if this was a criminal offence or a highway traffic act offence.
Today, a lot of that information is at least a lot more accessible to everyone (though in this case it still took a lot of work) and, furthermore, it can be mashed up with other public or semi-public data.
I'm pretty sure this is something we'll collectively be coming to terms with for a long time.
But that doesn't change the basic point. If you were to ask if you should be able to look up anyone's physical address by typing their name into a web page, I suspect many people would say no. Yet, here we are.
(I also suspect that many would be really shocked at the amount of info available about them from "deep web" searches much less via a $20 online background check.)
Near where I work in Bellevue WA, they recently restriped the road to have a brightly painted bike lane, with double-white lines to make it abundantly clear that you were not supposed to drive in it. Bright red "no stopping" signs were placed on the curb. People still parked right in the bike lane.
It wasn't until they added a concrete barrier that the lane cleared up enough that bikes could use it. And of course, right where the barrier ends, people start parking there instead. The West side seems to have less difficulty understanding this.
We still have the unmarked white vans over here. I think it's Amazon.
Or use an e-bike so you don't have to pedal as hard.
I live in the Boston area and I see a ton of people in business attire riding bikes.
That explains. Contrast the climates of Georgia or Florida.
During the summers here, just standing outdoors in the sunlight wearing anything heavier than beachwear is inadvisable at best.
Humidity would make this more difficult. Austin's not as humid as Houston, but it's not a desert either.
On the ride home I change back into my riding clothes and spray sunscreen. Just keep hydrated and you will be fine.
You'll adapt in more than one way. You'll get in better shape, so you won't sweat as much or be as tired by the cycling (or tired at all). More importantly, your attitude towards the discomfort will change. Yes, it takes effort and can be uncomfortable, but it's all perfectly acceptable after a while. Seems like a good example of the hedonic treadmill.
I already run about 25 miles per week, but thanks for your concern.
The thing is, I get up at the crack of day so as to do it in coldest weather, I wear appropriate athletic attire, I'm unladen by luggage, and if I feel too tired or something goes wrong then I lose a workout instead of my job.
Takes 90 minutes to do what my car can do in under 30, and I don't want to wake up yet another hour earlier.
Doesn't run to the area where I live; and would have to run even farther to reach the office. I suspect the NIMBYs want it like that.
I'm a miserly bachelor in an office full of respectable familypeople. None of them would dare to live within 2 zipcodes of where I live.
Is shockingly expensive to use for daily commuting, moreso than financing, fueling and maintaining my own car. Plus, unless rush-hour lasts long enough, each cab will only get one commute ride per rush-hour, so there's no net decrease in traffic for commuting via cab.
At least, not without ... martial-arts training? Or maybe a "penniless grad-student" disguise?
I can recall a discussion with someone on here about how parking in the bike lane in San Francisco is usually caused by a lack of options. That might be true, but I think most US cities are a lot closer to Austin than SF.
Hundreds of thousands of records a month. I ended up importing them into Excel(1) and then using... what was that called? An MS/Windows library that came with IE 5 and/or a few other things, that provided regex support (with a few quirks) that was accessible via VBA.
The point was, I could programmatically mine it -- including regex pattern matching and replacement of and within cell contents -- while also having a flexible UI within which to find and handle one-off cases. When the one-offs demonstrated a repeating pattern, I could quickly iterate to add that to the programmatic mining logic.
This included adding color cueing for items of particular interest, manual follow-up. Excel's sorting capabilities to bring potentially related instances into visually displayed groups. And the like.
It ended up working quite well. I might have preferred something else to VBA, and I did use Perl and other stuff, elsewhere (something that also gave me both power and the flexibility to rapidly iterate).
But the point is, with such data, I found it very useful to combine regex and rapid programmatic manipulation, together with a good visual interface (including visual cues, the ability to comment upon instances -- Excel cell-level comments -- etc.) and manual manipulation.
As a final aside, the extensive set of Excel keyboard shortcuts greatly aided in rapidly and effectively navigating and massaging the imported data.
1. This was back when Excel had... I think it was a 64K (or a bit less) limit on the number of rows in a sheet.
P.S. I tended to retain the originally imported data in its columns, and to produce my mining of it in other columns. That way, I could always and immediately see what I started with, for any particular record. (And, if things visually started to be "too many columns", well, Excel lets you hide a range of columns from the view. As one example of how its features really helped, on the visual front while doing this work.)
I still had to learn and allow for some quirks Excel exhibited with respect to importing text data. That included making sure the cells/columns being imported into carried the correct/needed formatting designation before importing into them (usually, "Text").
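The same workflow translates to a script reasonably well. What follows is only a sketch in Python rather than the VBA setup described above: the sample rows and the `TICKET_RE` pattern are made up for illustration. It keeps the raw text next to the derived fields and flags anything the pattern misses for manual follow-up.

```python
import re

# Hypothetical raw records, standing in for imported free-text rows.
raw_rows = [
    "TICKET 1001 at 1200 n lake  shore dr",
    "ticket 1002 at 55 W MAIN ST",
    "call back re: unknown location",
]

# Pattern for one repeating case spotted while browsing the data.
TICKET_RE = re.compile(r"ticket\s+(\d+)\s+at\s+(.+)", re.IGNORECASE)


def mine(raw):
    """Return a record that keeps the raw text and adds derived fields."""
    match = TICKET_RE.search(raw)
    if match is None:
        # No known pattern matched: flag for manual review, don't guess.
        return {"raw": raw, "ticket_id": None, "address": None,
                "needs_review": True}
    # Collapse runs of whitespace and upper-case the address part.
    address = re.sub(r"\s+", " ", match.group(2)).strip().upper()
    return {"raw": raw, "ticket_id": int(match.group(1)),
            "address": address, "needs_review": False}


mined = [mine(row) for row in raw_rows]
for_review = [record["raw"] for record in mined if record["needs_review"]]
```

When the `for_review` pile shows a repeating shape, that shape becomes another pattern, which is the iterate-on-one-offs loop described above.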
I'm surprised he asked after license plates, though. I don't know if that is different in the USA, but in Europe that certainly wouldn't fly because of privacy. I wouldn't even have asked because I shouldn't want to have such data. Perhaps one could get an anonymized version to be able to correlate how often a certain plate got a ticket, but not which plate that was. Anyway, the general concept of a FOIA request is the same. (Edit: Oh, someone else remarked this as well: https://news.ycombinator.com/item?id=17754396)
If it is annually, they got 17m tickets over 7 years. So for 10 years, assuming they issue just over 19m tickets, each parking ticket needs to bring in at least $10 to cover the cost. Even at $100 per ticket, IBM is banking on a 10% share? That seems excessive to me, but I never worked in government, so could someone enlighten me on this?
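For what it's worth, the per-ticket arithmetic can be checked in a few lines. This is only a sketch: it assumes the $190 million IBM contract figure quoted elsewhere in the thread, the guess of 19m tickets over 10 years, and a round $100 fine.

```python
contract_cost = 190_000_000          # dollars; contract figure cited in the thread
tickets_over_contract = 19_000_000   # assumed ticket count over 10 years
assumed_fine = 100                   # dollars per ticket, a round number

# Contract cost that each ticket has to carry.
cost_per_ticket = contract_cost / tickets_over_contract

# Fraction of an assumed fine that goes toward the contract.
share_of_fine = cost_per_ticket / assumed_fine

print(f"${cost_per_ticket:.2f} per ticket, {share_of_fine:.0%} of a ${assumed_fine} fine")
```

Changing `tickets_over_contract` or `assumed_fine` shows how sensitive the 10% figure is to those two guesses.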
Is there by any chance a conflict of interest when it comes to the government being willing to make improvements that cut down parking tickets or any other similar source of income? Or maybe that's what a public audit is for?
I wrote a blog post about it, because it requires a ton of work to get FOIA requested data - this I'm assuming was done in the same painstaking way:
I give this props. I'm sure it required a ton of work
Did you give any more thought to the address cleaning bit? Or does anyone have an idea how to go about transforming mangled addresses into coordinates?
I have a problem that's been bothering me for months, similar to what you have here: people from an emergency service call-center are inputting the addresses of the emergencies. For emergencies that happen on the public domain, there is often not a specific address, but rather names of landmarks. Something like "Street StreetName / Opposite Train Station Y", which can be written like "st stName / opp tr st y" or some other infinite variations.
I don't have any after-data to corroborate, but I do have previous instances where the operator inputted the same address better. If I can extract the correct landmarks, I think I can do a Google Places search for them, with a cleaned query, like "Store Amazon, Best Street, Ohio" to get coordinates that can fall into an acceptable area.
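One way the "cleaned query" step could look, as a minimal sketch: split the operator's text on "/", then expand abbreviations, trying multi-word abbreviations before single-word ones so that "tr st" becomes "train station" rather than "tr street". The abbreviation table here is entirely hypothetical; a real one would probably have to be mined from the earlier, better-entered instances.

```python
import re

# Hypothetical abbreviation table, keyed by token sequences.
ABBREVIATIONS = {
    ("tr", "st"): "train station",
    ("opp",): "opposite",
    ("st",): "street",
}
MAX_KEY_LEN = max(len(key) for key in ABBREVIATIONS)


def expand(segment: str) -> str:
    """Expand known abbreviations in one segment, longest match first."""
    tokens = re.findall(r"\w+", segment.lower())
    out = []
    i = 0
    while i < len(tokens):
        for n in range(MAX_KEY_LEN, 0, -1):
            key = tuple(tokens[i:i + n])
            if len(key) == n and key in ABBREVIATIONS:
                out.append(ABBREVIATIONS[key])
                i += n
                break
        else:
            # No abbreviation starts here; keep the token unchanged.
            out.append(tokens[i])
            i += 1
    return " ".join(out)


def clean_query(raw: str) -> list[str]:
    """Split 'street / landmark' input and expand each part."""
    return [expand(part) for part in raw.split("/") if part.strip()]


print(clean_query("st stName / opp tr st y"))
```

Each expanded part could then be sent to a places search as its own query. Note that this lowercases everything and does nothing about typos, so it would only be a first pass.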
PS: in the example you gave with Lake Shore Drive, I think you could easily correct the names with an algorithm based on the Levenshtein distance
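A minimal sketch of that idea, assuming you already have a list of known-good street names to snap to. The names and misspellings below are made up for illustration and are not taken from the ticket data.

```python
def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance (insert, delete, substitute)."""
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


# Hypothetical list of verified street names.
CANONICAL = ["LAKE SHORE DR", "LAKE ST", "SHERIDAN RD"]


def closest(name: str, candidates: list[str]) -> str:
    """Return the candidate with the smallest edit distance to name."""
    return min(candidates, key=lambda c: levenshtein(name, c))


for misspelled in ["LAKESHORE DR", "LAKE SHOR DR"]:
    print(misspelled, "->", closest(misspelled, CANONICAL))
```

In practice you would also want a maximum distance, so that a name with no good match is left alone instead of being forced onto the nearest candidate.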
My current stack is:
1. Send addresses to https://smartystreets.com/ - They gave me a year's worth of unlimited geocoding for free. They also tokenize the addresses, but I had about a 50% success rate with them.
2. Tokenize raw addresses with https://github.com/datamade/usaddress.
3. Use a normalized Levenshtein distance algo to get a ratio of difference.
4. Compare all of the addresses' Levenshtein distances with each other.
5. Apply a logistic regression/gradient ascent algo to tickets by chaining heavily typo'd addresses to less-typo'd ones and eventually to a static list of verified-correct addresses.
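Steps 3 to 5 can be roughly sketched as below. To be clear, this is not the logistic regression/gradient ascent approach described above: it swaps in a simple greedy loop with a fixed similarity threshold, just to show what "chaining" a badly typo'd address through a less-typo'd one to a verified address might look like. The addresses and the 0.92 threshold are made up.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance via the standard two-row dynamic program."""
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1, current[j - 1] + 1,
                               previous[j - 1] + cost))
        previous = current
    return previous[-1]


def similarity(a: str, b: str) -> float:
    """Normalized ratio: 1.0 for identical strings, 0.0 for nothing in common."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - levenshtein(a, b) / longest


def chain_to_verified(raw, verified, threshold=0.92):
    """Map raw addresses to verified ones, allowing hops through resolved typos."""
    resolved = {v: v for v in verified}
    pending = [r for r in dict.fromkeys(raw) if r not in resolved]
    progress = True
    while pending and progress:
        progress = False
        still_pending = []
        for addr in pending:
            # Compare against everything resolved so far, typos included.
            best = max(resolved, key=lambda known: similarity(addr, known))
            if similarity(addr, best) >= threshold:
                resolved[addr] = resolved[best]
                progress = True
            else:
                still_pending.append(addr)
        pending = still_pending
    return resolved


verified = ["1200 N LAKE SHORE DR"]
raw = ["1200 N LAKSHOR DR", "1200 N LAKESHOR DR",
       "1200 N LAKESHORE DR", "55 W MADISON ST"]
result = chain_to_verified(raw, verified)
for addr in raw:
    print(addr, "->", result.get(addr, "(unresolved)"))
```

Here "1200 N LAKSHOR DR" is too far from the verified address to match directly, but it gets there in two hops once the less-mangled variants have been resolved. The obvious risk is that chaining can also drift from one real address to a different real address, so the threshold matters a lot.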
It works surprisingly well, but there are still a lot of problems that can't easily be solved:
1. Street types (st/ave/blvd/etc) are missing. So, when two addresses have the same street name, it's difficult to pair the two. It's still possible with some probability stuff and by matching the ticketers' paths to the nearest street.
2. Addresses have a LOT of one-off situations. For example, there's a street name called "Avenue A". The street name here is "Avenue", and the street type (usually st/ave/etc) is "A".
3. Lots of four-letter streets make Levenshtein distance very difficult.
Glad you enjoyed it!
I already have some preliminary data - in a city with 350k inhabitants, they gave 150k fines last year, totaling 2.5 mil EUR. I can't wait to search for the hotspots
It's a C library with gigabytes of data, so it isn't lightweight, but it attacks the problem aggressively.
I guess I just learned I half expected each person who wanted FOIA data to have to request it themselves, for their own personal use.
In this case I can see reusing this for interesting reasons (the plates in the .txt.gz have not been removed), so...
The footer indicates that the web page was generated using bashblog – looks like it might be worth checking out.
You're correct that a simple DB with some forms would be cheap.
But integration tends to be crazy expensive. For this sort of system, other things that also need to be covered:
1. Billing integration. Including changes to billing codes, bill (fine) printing, testing.
2. Audit integration. Because whenever money is handled, audit follows.
3. Customer support integration. Including UI for customer service, training, testing. This is often a very complex item because customer service already have a zillion systems they have to use and their training requirements are ongoing and expensive, so they want you to integrate with their existing systems instead of giving them a brand new thing, and integrate with their existing training processes, etc etc.
4. Integration with all those hand-held readers, including vendor compliance, testing, etc.
5. Contract management. You have a contract with the government and they'd like to know that you did what you claim you did. So there's teams of people to deal with on an ongoing basis.
6. Project management. There's more than one person working on this, and a lot of complex integration requiring changes in other systems => extensive project management.
7. Ongoing changes to requirements, often conflicting. All the integration points above are moving targets, so expect that they'll have to be re-done a few times both before and after launch.
8. Arse covering. You now have a large contract with the US Government. You will get sued and they will get sued (typically by whoever didn't win the contract). Vast amounts of documentation covering _everything_, including documenting the process by which documents are written => tech writers galore, plus lawyers plus lawyers.
Honestly, this is barely scratching the surface. I haven't even touched the (expensive) work before the contract is even signed.
$190M doesn't go very far!
This is so impossible to optimize for :(
The really sad thing is that it's not humanly possible for one, or two or even just five people to do all of this, unless they were all like 19 or something, and then not for very long.
You literally do need all those departments that have the applied knowledge to do what they do well. Voila, $200M.
On a technical basis, it's trivial - you already have the data stream that's going to be sent to the printer, so generating a PDF wasn't going to be an enormous roadblock (though it wouldn't have been completely trivial as the source data was PCL not PS - did you know that there was handling for that in Ghostscript, at least on the commercially-licensed side?). Encryption of PDFs was also possible, either with separately-licensed open source tools or with some closed-source commercial alternatives. Even ignoring the possibility of email being intercepted in transit, encryption would have been a requirement due to the risk of someone walking up to an unattended desk and simply checking that attached PDF for someone's pay info.
The killer? The infrastructure required to assign passwords and allow people to change them, including management, training, etc. By the time you've built that, you're a chunk of the way to simply providing the payroll information within an online HR system instead.
Like the old trope about the first man on Mars being a technician for an unreliable rover, the bulk of the work and cost isn't always where you'd think it would be.
All of the entry level positions are above 80/hr.
The vendor is covering insurance, holiday/sick leave et al, payroll taxes et al, rent, management, carrying the risk of the work going away, etc etc.
It looks high, but it's not nearly as profitable as it seems for either the vendor or the employee.
The likely reason there are many tickets there is that there are many bars there, and great crowds of people who have had a bit too much to drink. There are also great crowds of cops there every weekend.
Without looking at the data, I'd expect that many of the tickets are getting written in the middle of the night, when people are too inebriated, or too distracted, to read signs carefully.
Not saying the signage was clear, but that is a very very weak excuse to not understand them.
It's not a coincidence that that huge number of tickets is being written in a busy bar area.
"This'll be my first blog post on the internet, ever. Hopefully it's interesting and accurate. Please point out any mistakes if you see any!"
KEEP IT UP MATT! and data munging, not sure if it's a word, but it sounds nice :D
Though, during the last mayoral election, some of the mayoral candidates wanted to use parking tickets as part of their campaigns, and through some connections I found my way into Bob Fioretti's campaign manager's office to discuss parking tickets, alongside the campaign manager of an ex-candidate, Amara Enyia. They were super, super interested - Fioretti's CM calling the work "fucking golden". But... they both went silent after that, despite Fioretti starting to use parking tickets as a major part of his campaign. Go figure.
There's a lot more to that story - I'll end up writing about it sometime later :).
Revenue from parking tickets is easy money for a violation that is generally harmless.
edit: my bad, didn't see it at the end
There's a few good explanations there.
Sure, this may seem a bit "evil", and the better solution is reducing negative externalities through taxing, which is a more transparent and ethical solution, but most of us don't have that level of power and influence within local councils and governments.
Don't punish cars, promote electric cars, scooters, bikes, and other ways of getting around. Dedicate more parking to electric-only spots (with chargers), bike racks and bike share docks, and scooter parks.