

Who's working on IT plumbing? - smithbits

I work in IT. The company I work for makes things and sells them to people, so my job isn't about writing complex software; it's about making it easier to make things and sell them. For the record, I work with a great group of people and really like this job. And because I'm not new to this, I've seen the same problems over and over. A simple example is that customer lists always have junk data in them. I've never seen a list of more than 10K names that I couldn't find problems in with some basic SQL queries. Given that this happens everywhere, who's blogging about the solutions? For example, I want a database that has a type called "FirstName", so that when I run a query like:

select FirstName from Customers where FirstName like 'Alex';

I see "Alex", "Alexander" and "Alejandro" in the results list. Now, for this specific problem I realize there are many interesting answers outside the database. I goofed off with Python's difflib one Friday afternoon, read up on Levenshtein distance, and made a crude audit tool. But who out there is actually solving these problems, or at least writing about them in new and interesting ways? Thanks.
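For illustration, a crude audit along those lines might look like the sketch below. It is not the tool described above, just a minimal guess at the idea: compare every pair of names with difflib's SequenceMatcher and flag pairs whose similarity ratio crosses a threshold. The function name `likely_duplicates`, the sample names, and the 0.85 cutoff are all assumptions made for the example.

```python
import difflib
from itertools import combinations


def likely_duplicates(names, cutoff=0.85):
    """Return (name_a, name_b, ratio) for pairs that look like near-duplicates.

    `cutoff` is an arbitrary threshold chosen for this sketch; a real audit
    would need to tune it against known-good data.
    """
    flagged = []
    for a, b in combinations(names, 2):
        # ratio() is 2*M/T: M = matched characters, T = total characters.
        ratio = difflib.SequenceMatcher(None, a.lower(), b.lower()).ratio()
        if ratio >= cutoff:
            flagged.append((a, b, round(ratio, 2)))
    # Most similar pairs first.
    return sorted(flagged, key=lambda item: item[2], reverse=True)


if __name__ == "__main__":
    # Hypothetical customer names, not real data.
    for a, b, ratio in likely_duplicates(["Jon Smith", "John Smith", "Jane Doe"]):
        print(a, "~", b, ratio)
```

Comparing every pair is quadratic in the number of names, so on a list of 10K+ customers this would probably need some blocking step first (for example, only comparing names that share a first letter or a postal code). It also measures string similarity only, so it says nothing about whether "Alex" and "Alejandro" are the same given name.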
======
GFischer
About your query: Oracle has the "soundex" function but it won't return
exactly that.

<http://en.wikipedia.org/wiki/Soundex>
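
To see why Soundex "won't return exactly that", here is a small sketch of the classic American Soundex rules in Python. It is an illustration written from the published algorithm, not Oracle's implementation, so edge cases may differ from what the database function returns; the function name `soundex` is just chosen for the example.

```python
# Letter-to-digit table for classic American Soundex.
CODES = {}
for letters, digit in (("bfpv", "1"), ("cgjkqsxz", "2"), ("dt", "3"),
                       ("l", "4"), ("mn", "5"), ("r", "6")):
    for ch in letters:
        CODES[ch] = digit


def soundex(name):
    """Return the 4-character Soundex code of `name` (ASCII letters only)."""
    letters = [c for c in name.lower() if "a" <= c <= "z"]
    if not letters:
        return ""
    result = letters[0].upper()
    prev = CODES.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "hw":
            continue  # h and w do not break a run of same-coded letters
        code = CODES.get(ch, "")
        if code and code != prev:
            result += code
        prev = code  # a vowel resets prev, so a repeated code is kept
    # Pad with zeros, then truncate to one letter plus three digits.
    return (result + "000")[:4]


if __name__ == "__main__":
    for name in ("Alex", "Alexander", "Alejandro"):
        print(name, soundex(name))
```

Under these rules "Alexander" and "Alejandro" both come out as A425, while "Alex" comes out as A420, so a Soundex match on "Alex" would miss the two longer names.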

The solutions we have in place here in Uruguay are far simpler: a
nationwide ID (issued by DNIC
<http://www.minterior.gub.uy/webs/dnic/index.htm>) and the local Equifax
branch, both of which hold unique data on each customer (I used to work for
said Equifax branch).

No idea about a blog, sadly.

