Curious to hear from different industries: what’s the most frustrating or repetitive data task you deal with, and how are you solving it?
I do software implementations, and we get customer data exports from legacy systems as CSV or XLSX. Cleaning and mapping them for import is always a pain.
Anyone else constantly structuring, formatting, or fixing data? How do you deal with it? Any good tools or workarounds?
Convert the XLSX to CSV. Pretty much every database system out there has a blazing fast bulk import for CSV files.
As a SQL wizard, I prefer to use SQL to clean and re-shape data. So my first goal is to get the data into a SQL DB as quickly as possible: no cleaning, no re-shaping, just a raw dump into staging tables. Now the data is in my house. I clean and re-shape the data there with batch update/insert statements. Finally, I batch insert into the target tables.
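A rough sketch of that raw-dump-then-clean workflow, using Python's built-in `csv` and `sqlite3` modules so it runs anywhere. The sample export, the `staging_customers` and `customers` tables, and the specific cleaning rules are all made up for illustration; a real job would use the actual source and target schemas and the database's own bulk loader.

```python
import csv
import io
import sqlite3

# Hypothetical legacy export; in practice this would be a file on disk.
RAW_CSV = """Customer Name,Email,Signup Date
  Ada Lovelace ,ADA@EXAMPLE.COM,2021-03-04
Bob Smith,bob@example.com ,
,missing@example.com,2020-01-01
"""


def load_and_reshape(csv_text):
    conn = sqlite3.connect(":memory:")

    # 1. Raw dump: every column is TEXT, nothing is cleaned on the way in.
    conn.execute(
        "CREATE TABLE staging_customers (name TEXT, email TEXT, signup_date TEXT)"
    )
    reader = csv.reader(io.StringIO(csv_text))
    next(reader)  # skip the header row
    conn.executemany("INSERT INTO staging_customers VALUES (?, ?, ?)", reader)

    # 2. Clean in place with one set-based UPDATE (assumed rules:
    #    trim whitespace, lowercase emails, empty dates become NULL).
    conn.execute(
        """
        UPDATE staging_customers
        SET name = TRIM(name),
            email = LOWER(TRIM(email)),
            signup_date = NULLIF(TRIM(signup_date), '')
        """
    )

    # 3. Batch insert into the (hypothetical) target table,
    #    skipping rows that have no customer name.
    conn.execute(
        """
        CREATE TABLE customers (
            name TEXT NOT NULL,
            email TEXT NOT NULL UNIQUE,
            signup_date TEXT
        )
        """
    )
    conn.execute(
        """
        INSERT INTO customers (name, email, signup_date)
        SELECT name, email, signup_date
        FROM staging_customers
        WHERE name <> ''
        """
    )

    rows = conn.execute(
        "SELECT name, email, signup_date FROM customers ORDER BY email"
    ).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for row in load_and_reshape(RAW_CSV):
        print(row)
```

The staging table stays untouched by type constraints on purpose, so a bad value never blocks the load; the constraints live only on the target table.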
> what’s the most frustrating
Every import job is a custom scenario. I feel specialized tools don't give you much. You have to understand both the source and destination data to clean and re-shape it, and tools don't have that understanding. AI is less than worthless. At the end of the day you have to roll up your sleeves and start shaping data.