
> A Python script loading a 1GB file is likely to take more than 1 minute on any machine.

Since when? I just tested it, and I can process a 1 GB CSV file on my machine using the csv module in Python in ~10 seconds. My machine isn't a record-setting machine, either.
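For anyone who wants to check this on their own hardware, here is a minimal sketch of how such a timing could be done. It is not the exact script behind the ~10 second figure; the function name `count_csv_rows` is hypothetical, and it assumes a UTF-8 file and does nothing with the rows beyond counting them.

```python
import csv
import time


def count_csv_rows(path):
    """Stream a CSV file with the csv module; return (row_count, seconds)."""
    start = time.perf_counter()
    rows = 0
    # newline="" is what the csv docs recommend so quoted newlines parse correctly.
    with open(path, newline="", encoding="utf-8") as f:
        for _ in csv.reader(f):
            rows += 1  # only counting; real processing would add to the time
    return rows, time.perf_counter() - start
```

Calling `count_csv_rows("big.csv")` (a placeholder file name) returns the row count and the elapsed wall-clock time. Because the file is streamed row by row, memory use stays flat regardless of file size.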




This is OP, but that is amazing.

I tried reading a 1 GB CSV and adding all the rows to SQLite, and it definitely takes minutes. This happens in both JS and Python.

Constructing a giant INSERT should already exceed 1 minute (as would transforming any 1 GB string).

Would you mind sharing your Python snippet?
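One way to avoid building a giant INSERT string is to stream the rows in batches through a parameterized statement inside a single transaction. The following is only a sketch under assumptions, not anyone's measured script: `load_csv_into_sqlite`, the default table name `data`, and the batch size are all hypothetical, and it assumes the first row is a header with plain column names (no double quotes) and that every row has the same number of fields.

```python
import csv
import itertools
import sqlite3


def load_csv_into_sqlite(csv_path, db_path, table_name="data", batch_size=50_000):
    """Insert CSV rows into SQLite in batches; return the number of rows inserted."""
    total = 0
    with open(csv_path, newline="", encoding="utf-8") as f:
        reader = csv.reader(f)
        header = next(reader)  # assumption: first row holds the column names
        columns = ", ".join(f'"{name}"' for name in header)
        placeholders = ", ".join("?" for _ in header)
        insert_sql = f'INSERT INTO "{table_name}" VALUES ({placeholders})'

        conn = sqlite3.connect(db_path)
        try:
            with conn:  # commits once at the end, rolls back on error
                conn.execute(f'CREATE TABLE IF NOT EXISTS "{table_name}" ({columns})')
                while True:
                    # Pull a bounded batch so the whole file never sits in memory.
                    batch = list(itertools.islice(reader, batch_size))
                    if not batch:
                        break
                    conn.executemany(insert_sql, batch)
                    total += len(batch)
        finally:
            conn.close()
    return total
```

Keeping everything in one transaction matters here: committing after every row forces a disk sync each time, which is usually what turns seconds into minutes.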


Not sure how you'd do this from Python, but you can fire up sqlite3 from the command line and run:

    .mode csv
    .import filename.csv table_name

On my 2-year-old Windows laptop (i7, Lenovo X1), importing a 2M-row, 1.4 GB file takes about 70 seconds.

So not 10 seconds, but there is no need to create an INSERT statement (I'm sure you can access this functionality from Python if you need to).
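One possible way to reach the same import from Python is to drive the `sqlite3` command-line shell through `subprocess` and feed it the two dot-commands on standard input. This is a sketch, not a tested benchmark: the helper names `build_import_script` and `import_csv_with_cli` are hypothetical, and it assumes the `sqlite3` executable is on the PATH and that the file path and table name contain no whitespace or quotes.

```python
import shutil
import subprocess


def build_import_script(csv_path, table_name):
    """Return the dot-commands the sqlite3 shell should run, one per line."""
    # Assumption: neither argument needs quoting for the shell's dot-command parser.
    return f".mode csv\n.import {csv_path} {table_name}\n"


def import_csv_with_cli(db_path, csv_path, table_name):
    """Run the sqlite3 shell's CSV import against db_path."""
    if shutil.which("sqlite3") is None:
        raise RuntimeError("sqlite3 command-line shell not found on PATH")
    subprocess.run(
        ["sqlite3", db_path],
        input=build_import_script(csv_path, table_name),
        text=True,
        check=True,  # raise CalledProcessError if the shell exits non-zero
    )
```

With this, `import_csv_with_cli("data.db", "filename.csv", "table_name")` would run the same two commands as above. The parsing and inserting still happen in the shell's C code; Python only launches the process.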


I use a 2020 MacBook Pro, so it is probably faster. Also, 1.4 GB is quite different from 1 GB. These factors can multiply the run time.

Superintendent uses the same approach (i.e., it uses the CSV extension).

But I don't think we should call that Python ... since it essentially just calls C code in SQLite.



