Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

Thanks for sharing your tool!

I think it would be best to illustrate the readme with a problem it tries to solve: currently I don't understand why a simple set of INSERT operations wouldn't suffice.

Also, the phrase "ingest data into Postgres" sounds wrong: it would be postgres, not your tool that would ingest data (if it could), your tool should be described as the one "putting data into Postgres". And thus you probably have named your tool wrong...




Ah the name - you're not the first to mention it! Do you (or anyone lurking...) have any suggestions as to what it might better be called?

On what it does/why it exists, we've kept the README quite light to avoid duplication, with the main bits of the docs at https://pg-bulk-ingest.docs.trade.gov.uk/

But to try to answer the question here:

A set of insert statements - there are lots of cases where this would be fine, so pg-bulk-ingest (/its future name ;-) would be unnecessary, and so you might as well use insert statements.

But there are lots of things that pg-bulk-ingest does that a set of insert statements don't:

- It uses COPY, which in many cases is (much?) faster than INSERT

- It creates the table if it doesn't exist

- It migrates an existing table to match the definition if it does exist (pg-bulk-ingest can't handle all possible changes, but enough to make it useful to us)

- And it migrates without holding a long ACCESS EXCLUSIVE lock on the table

- It offers a fairly structured way to do high-watermarked ingest - it stores and retrieves a high watermark so you can fetch the data "since" some previous ingest, with the exact meaning of "since" custom to each source you're ingesting from.

- And it (now) combines COPYs with multiple tables, without loading all incoming data into memory at once, and keeping the ingest within a single transaction, so other clients see all the data, or none of it.

(The high watermarking I think is maybe the least clear of these in the docs, while also one of the most important things. Would really like an example that is somehow standalone that also makes it clear what it does...)


pg_alchemy_objects_writer




Consider applying for YC's Fall 2025 batch! Applications are open till Aug 4

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: