
Ask HN: How to easily use data from similar CSVs in Python - ivar_awesome
I have several CSV files that are similar but differ in more or less subtle ways. For instance, the columns may be rearranged or have different names. The files generally contain the same information but come from different sources, hence the differences in formatting.

Preferably I would keep them in a directory on my computer and use (or build) a tool to access the data from Python modules in a generic way.

My solution so far is an SQLite database and a script that tries to work out how to rearrange the data to fit the schema. This could work fine if the integration script were smart enough, but it has devolved into a messy case-by-case system and is proving difficult to make generic.

Have any of you done similar work, and are you willing to share some insight?
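
To make the approach concrete, here is a minimal sketch of the kind of header-mapping loader described above. It is not the actual script: the canonical columns (`date`, `name`, `amount`), the alias table, the `records` table, and the helper names are all hypothetical and would need to be adapted to the real files.

```python
import csv
import io
import sqlite3

# Hypothetical canonical schema; the real columns will differ.
CANONICAL = ["date", "name", "amount"]

# Hypothetical alias table: normalized source header -> canonical column.
ALIASES = {
    "date": "date", "day": "date", "timestamp": "date",
    "name": "name", "customer": "name", "customer name": "name",
    "amount": "amount", "total": "amount", "value": "amount",
}


def normalize_header(header):
    """Lowercase, treat underscores as spaces, collapse whitespace, look up alias."""
    key = " ".join(header.strip().lower().replace("_", " ").split())
    return ALIASES.get(key)  # None if the header is not recognized


def read_rows(fileobj):
    """Yield rows as tuples in CANONICAL order, whatever the source column order."""
    reader = csv.DictReader(fileobj)
    source_for = {}
    for header in reader.fieldnames or []:
        canon = normalize_header(header)
        if canon is not None:
            # Keep the first source column that maps to each canonical column.
            source_for.setdefault(canon, header)
    missing = [c for c in CANONICAL if c not in source_for]
    if missing:
        raise ValueError(f"no source column found for: {missing}")
    for row in reader:
        yield tuple(row[source_for[c]] for c in CANONICAL)


def load(conn, fileobj):
    """Insert the normalized rows of one CSV into the (hypothetical) records table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS records (date TEXT, name TEXT, amount TEXT)"
    )
    conn.executemany("INSERT INTO records VALUES (?, ?, ?)", read_rows(fileobj))
    conn.commit()


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    # Two inline sample files with different header names and column order.
    load(conn, io.StringIO("Customer Name,Total,Day\nAda,10,2020-01-01\n"))
    load(conn, io.StringIO("date,name,amount\n2020-01-02,Bob,5\n"))
    print(conn.execute("SELECT * FROM records ORDER BY date").fetchall())
```

Keeping the per-source differences in one alias table, rather than in per-file branches, is one way to stop the script from growing case by case; a file with an unrecognized header fails loudly instead of being loaded into the wrong column.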
======
brudgers
This might be easier to do with AWK because AWK was designed for this type of
task. The time spent learning AWK and getting exactly what you want stands a
good chance of being less than the time spent learning a library and beating
it into submission, in part because AWK has decades of development and
documentation behind it.

Good luck.

------
senthilnayagam
You can explore two tools, csvkit and xsv. They can help you view the headers
and rearrange, sort, merge, and join files.

csvkit is written in Python and is good, but xsv is written in Rust, is
available as a compiled binary, and is around 10x faster when working with
large CSV files.

~~~
ivar_awesome
xsv looks like exactly what I need, thank you!

