Hacker News | ejwhite's comments

Interesting.

I need to open a very large CSV file in Python, which is around 25GB in .zip format. Any idea how to do this in a streaming way, i.e. stopping after reading the first few thousand rows?


> I need to open a very large CSV file in Python, which is around 25GB in .zip format. Any idea how to do this in a streaming way, i.e. stopping after reading the first few thousand rows?

Replace the `file_paths` list in my proof of concept with your large file(s), delete the rest (lines 61-68, 77-79) and it should just work.


Works fine with Python's standard library. Files in a ZipFile can be read in a streaming manner. There is no need to store all the data in memory.

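    import csv
    import io
    import zipfile

    max_lines = 10
    with zipfile.ZipFile("data.zip") as z:
        for info in z.infolist():
            # z.open() returns a file-like object that decompresses as you read
            with z.open(info) as f:
                # newline="" is what the csv docs recommend; UTF-8 is an assumption
                text = io.TextIOWrapper(f, encoding="utf-8", newline="")
                reader = csv.reader(text)
                for i_line, line in enumerate(reader):
                    if i_line >= max_lines:
                        break
                    print(line)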
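
If you want the same thing in reusable form, here is a rough sketch as a generator. The name `head_rows` and the UTF-8 encoding are my own assumptions, so adjust them to your data. It yields at most `max_rows` rows per archive member and stops reading each member after that.

```python
import csv
import io
import itertools
import zipfile


def head_rows(zip_path, max_rows):
    """Yield up to max_rows CSV rows from each file in the archive.

    head_rows is a hypothetical helper name; UTF-8 is an assumed encoding.
    """
    with zipfile.ZipFile(zip_path) as z:
        for info in z.infolist():
            if info.is_dir():
                continue  # skip directory entries, they hold no data
            # Members are decompressed lazily as the reader pulls lines
            with z.open(info) as f:
                text = io.TextIOWrapper(f, encoding="utf-8", newline="")
                # islice stops pulling from the reader after max_rows rows
                yield from itertools.islice(csv.reader(text), max_rows)
```

Usage would be something like `for row in head_rows("data.zip", 5000): ...`, so memory use depends on the rows you consume rather than on the 25GB archive.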


This is true when writing to a file. The goal of my PoC was to not write a file and instead to stream to the web browser.


Nice explanation, thank you for sharing. Do you have any experience working with mobile apps? I'm wondering if any aspects of the design process you outlined differ when working with mobile apps?


Well you have a vote right here for mobile flows, I think that would be awesome!

May I ask... do you ever construct information architectures (aka sitemaps) for your clients? If so, what tools would you use for this?


Heh, I went to grad school with him. Haven't heard what he's been up to in over a decade.

I was solidly in the "C++" column back then, but have since become a data scientist who now uses numpy/Python for all machine learning. That talk was very interesting and helped me understand what they're doing in my old field these days. Thanks for sharing.

