
Show HN: Convert CSV files into MySQL insert scripts - d4nt
http://database-import-scriptor.appspot.com/
======
pi-rat
Or just do something like:

    
    
      LOAD DATA INFILE 'data.csv' INTO TABLE mytable
        FIELDS TERMINATED BY ',' ENCLOSED BY '"'
        LINES TERMINATED BY '\r\n'
        IGNORE 1 LINES;

~~~
akx
If you're fortunate enough to be able to put `data.csv` on the MySQL server,
that is.

~~~
masklinn
Add LOCAL (as in, "LOAD DATA LOCAL INFILE") and the path is looked up in the
client's filesystem, with the file uploaded through the connection before the
command is run.

~~~
akx
Cool. I swear that didn't exist the last time I looked at the docs for LOAD
DATA INFILE...

------
sarciszewski
In my experience, this is actually not that difficult or novel, but kudos on
trying new things. Keep it up :)

~~~
clxl
I'm surprised by how many things that aren't that difficult or novel end up on
the front page. CSV-related projects come to the fore every week.

~~~
sarciszewski
Heh, maybe I should start submitting some of my more trivial projects and rack
up the HN karma ;)

e.g. "A testing utility for database migrations"

------
mathnode
I am not putting 1TB+ per-table dumps of data into a stranger's website when
the job can be done with a bit of sed or a script in ${LANGUAGE}.
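A sketch of that script in Python (the table name, the header-row assumption, and the string-only quoting are mine, for illustration; real data needs type handling too):

```python
import csv
import io

def csv_to_inserts(text, table):
    """Turn CSV text into one INSERT statement per row.

    Assumes the first row is a header naming the columns, and emits
    every value as a quoted string with single quotes doubled.
    """
    reader = csv.reader(io.StringIO(text))
    header = next(reader)
    cols = ", ".join(header)
    for row in reader:
        vals = ", ".join("'" + v.replace("'", "''") + "'" for v in row)
        yield f"INSERT INTO {table} ({cols}) VALUES ({vals});"

sample = "name,city\nAda,London\nO'Brien,Cork\n"
for stmt in csv_to_inserts(sample, "people"):
    print(stmt)
```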

Loader, from Several Nines, is a tool for doing parallel loads of CSV into
MySQL and MariaDB:
[https://github.com/severalnines/loader](https://github.com/severalnines/loader)

Mydumper is a threaded dump and load tool for MySQL/MariaDB.
[https://launchpad.net/mydumper](https://launchpad.net/mydumper)

Tungsten Replicator can do batch-load replication into any external data source
[http://docs.continuent.com/tungsten-replicator-3.0/deploymen...](http://docs.continuent.com/tungsten-replicator-3.0/deployment-batchloading.html),
so long as you have a script for it:
[https://code.google.com/p/tungsten-replicator/source/browse/...](https://code.google.com/p/tungsten-replicator/source/browse/branches/replicator-3.0.0/replicator/samples/scripts/batch/mysql.js)

EDIT: A fun extra:
[https://mariadb.com/kb/en/mariadb/documentation/storage-engi...](https://mariadb.com/kb/en/mariadb/documentation/storage-engines/connect/connect-table-types/connect-table-types-data-files/#csv-and-fmt-table-types)

~~~
digital-rubber
I think it would actually be funny if you did, with bogus data of course.

But uploading your files to some site that reformats plain text? This does not
belong on Hacker News, imho.

------
phpnode
MySQL already supports loading CSVs directly, so why convert them first?
[http://dev.mysql.com/doc/refman/5.1/en/load-data.html](http://dev.mysql.com/doc/refman/5.1/en/load-data.html)

~~~
blowski
There are occasions where you can't just do a straight insert. Perhaps you
need to conditionally create a foreign record, or you want to format a value
on insert instead of doing it directly in the CSV.

That said, I can't see how the OP actually helps me with that.

~~~
jeremysmyth
LOAD DATA INFILE can't conditionally create a foreign record, but you can
certainly format a value on insert by using the SET clause, for example:

    
    
        LOAD DATA INFILE 'people.txt'
        INTO TABLE Person (@first, @last, date_of_birth)
        SET full_name = CONCAT(LEFT(@first, 1), '. ', @last);
    

You can even do funky stuff like lookups:

    
    
        LOAD DATA INFILE 'projects.txt'
        INTO TABLE Projects(@first, @last, project)
        SET person_id = 
         (SELECT id FROM people 
          WHERE name = CONCAT(@first, ' ', @last) LIMIT 1);

~~~
michaelmior
Cool! I wasn't aware you could do that sort of thing with LOAD DATA INFILE.
Thanks :)

~~~
blowski
I didn't know that either - thanks for sharing.

------
agildehaus
This submits to your server. If I have sensitive data, I sure as hell don't
want that going somewhere where it might be stored.

Why isn't this just done client-side in Javascript?

~~~
easytiger
More like: why isn't this a trivial bash script?

~~~
agildehaus
Indeed, but that's already been done.

------
eli
The first Python script I ever wrote does CSV to SQL:
[https://github.com/elidickinson/csv-tools/blob/master/csv2sq...](https://github.com/elidickinson/csv-tools/blob/master/csv2sql.py)
It's very rough, but I actually still use it on occasion. The Python csv
library is quite good. There's a lot that can go wrong that LOAD DATA can't
handle.
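A small illustration of what can go wrong (not taken from the script above): a quoted field containing a comma and an embedded newline is perfectly legal CSV, and the csv module parses it where naive line splitting falls apart:

```python
import csv
import io

# Legal CSV: the second data field contains a comma and a newline.
text = 'id,comment\n1,"hello, world"\n2,"line one\nline two"\n'

# Naive approach: split on newlines, then on commas.
naive = [line.split(",") for line in text.splitlines()]

# The csv module respects the quoting rules.
rows = list(csv.reader(io.StringIO(text)))

print(len(naive))  # 4 "rows": the embedded newline split one record in two
print(len(rows))   # 3 rows: header plus two data records
print(rows[2])     # ['2', 'line one\nline two']
```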

------
NicoJuicy
I actually have some code (.Net) that loads xml, csv, ... into objects.

There is also a visual drawing to help you connect the dots between the
properties of the uploaded file vs. the properties of the objects.

This way, all information gets loaded into the db from a single import, in the
correct table.

I could separate it into an open-source project, I suppose, if anyone is
interested.

~~~
kstenson
I would be interested in looking at that if you opensource it.

~~~
NicoJuicy
You're a .Net developer then?

------
itazula
I submitted the following file (note the blanks):

    
    
        $ cat testblanks.csv
        1997, Ford, E350

The following was returned:

    
    
        CREATE TABLE testblanks (Column_1 NUMERIC(32, 16),
          Column_2 VARCHAR(5) CHARACTER SET utf8,
          Column_3 VARCHAR(5) CHARACTER SET utf8);
        INSERT INTO testblanks VALUES (1997,' Ford',' E350');

That was in accordance with
[http://tools.ietf.org/html/rfc4180](http://tools.ietf.org/html/rfc4180)

Cool!

But the CSV format has a lot of "gotchas." See, for example,
[http://discuss.fogcreek.com/joelonsoftware3/default.asp?cmd=...](http://discuss.fogcreek.com/joelonsoftware3/default.asp?cmd=show&ixPost=99824&ixReplies=25).
Especially the rant about the Microsoft version of CSV.

~~~
masklinn
> But the CSV format has a lot of "gotchas."

In the wild, CSV is less a format and more a meta-format (and a subset of the
*SV family) with plenty of variability between dialects.
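Python's csv module even models this explicitly: each variant is a Dialect, and csv.Sniffer tries to guess one from a sample. A quick sketch (Sniffer is heuristic and can guess wrong on odd inputs):

```python
import csv
import io

# Three dialects of the same tiny table.
samples = {
    "comma": "a,b,c\n1,2,3\n",
    "semicolon": "a;b;c\n1;2;3\n",
    "tab": "a\tb\tc\n1\t2\t3\n",
}

for name, text in samples.items():
    # Sniffer inspects the sample and returns a Dialect subclass.
    dialect = csv.Sniffer().sniff(text)
    rows = list(csv.reader(io.StringIO(text), dialect))
    print(name, repr(dialect.delimiter), rows[1])
```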

------
vijayr
Nice idea.

Can I suggest another idea? Take JSON files and convert them into MySQL,
Postgres, etc. insert/update scripts. There is a need for this. For example, I
run a hobby site, [http://kivatools.com](http://kivatools.com), that imports
tons of data from kiva.org; their data dumps are in JSON/XML format, so I have
a script that refreshes the db every night with a couple of gigs of data. It
would be very useful if there were a service that could take a URL (or file)
and convert JSON/XML into MySQL scripts.

------
Gonzih
I think this can be done in a few lines of Ruby/Python/whatever, without
submitting sensitive data to an unknown remote server. IMHO it would be a much
better idea to release the code as open source.

------
sheetjs
You should be able to do this entire thing in-browser. Going further, by
leveraging WebSQL or SQL.js, it should be possible to pump the generated
statements into a client-side database and run queries against it.
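The flow is easy to sketch: parse the CSV, load it into an in-memory database, then query it on the spot. Since SQL.js is SQLite compiled to JavaScript, Python's built-in sqlite3 makes a fair stand-in for the idea (file contents and table name are invented):

```python
import csv
import io
import sqlite3

# Parse the CSV in memory, as a browser tool would after a file drop.
text = "name,qty\napples,3\npears,5\n"
rows = list(csv.reader(io.StringIO(text)))
header, data = rows[0], rows[1:]

# Pump the rows into an in-memory database...
db = sqlite3.connect(":memory:")
db.execute(f"CREATE TABLE fruit ({', '.join(header)})")
db.executemany("INSERT INTO fruit VALUES (?, ?)", data)

# ...and run queries against it immediately.
total = db.execute("SELECT sum(qty) FROM fruit").fetchone()[0]
print(total)  # 8 (SQLite coerces the text values to numbers)
```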

Shameless plug: I built an in-browser tool to generate SQL queries from Excel
XLS/XLSX/XLSB files: [http://sheetjs.com/sexql/](http://sheetjs.com/sexql/)

Discussion:
[https://news.ycombinator.com/item?id=7392665](https://news.ycombinator.com/item?id=7392665)

------
onion2k
Tried it with a file containing this:

    
    
        Field1,Field2,Field3,Int,Float,Quoted
        This,is,a,12,1.2,"""file"""
    

...and it returned an empty textarea.

------
quasiben
This is great! People should have better and faster ways of handling CSV -> DB
interactions. We recently added native CSV loading into blaze for MySQL,
PostgreSQL, and SQLite:
[https://github.com/ContinuumIO/blaze/blob/master/blaze/data/...](https://github.com/ContinuumIO/blaze/blob/master/blaze/data/sql.py#L283)

MongoDB is coming soon.

------
gregcsv
Here's one I made a few years ago:

[http://utilitymill.com/utility/csv_to_create_table/](http://utilitymill.com/utility/csv_to_create_table/)

You just paste the content from Excel or a CSV.

It actually tries to figure out the data types and creates a table for you as
well.

------
stef25
Tested it out with GeoIPCountryWhois.csv from MaxMind; it seems to work OK.
However, selecting the output is almost impossible: it locks up Chrome. This
isn't your app's fault, of course. Usually I use Sequel Pro for this, which
allows me to map CSV fields to db columns.

------
abluecloud
I've found phpMyAdmin's CSV insert feature very useful.

Allows for quite a few options.

------
pessimizer
[http://neilb.bitbucket.org/csvfix/manual/csvfix16/sql_insert...](http://neilb.bitbucket.org/csvfix/manual/csvfix16/sql_insert.html)

------
squidmccactus
So does this work with CSV?

------
digital-rubber
What kind of ridiculous product/attempt is this?

HACKER news, not noobs-that-can't-do-shit.

~~~
digital-rubber
Quite amusing: the people downvoting my minimally worded critique apparently
really like this tool and see a benefit that is non-existent. Apparently they
can't think beyond a solution to a problem.

Reformatting ASCII/text data, a CSV, is not rocket science. Nor is it 'hacker
(news) quality'.

And I hope that nobody realises afterwards that they should not have uploaded
their CSV file to your website, hosted by NSA-friendly Google. Adding a
warning that people are submitting their data to servers they don't control
(Google's, third parties') would also be nicer, and would perhaps make people
aware that their data goes further than just their own browser.

~~~
easytiger
You were not elegant or polite, but you are in fact correct.

