
Faster bulk loading in Postgres with copy - craigkerstiens
https://www.citusdata.com/blog/2017/11/08/faster-bulk-loading-in-postgresql-with-copy/
======
vimalbhalodia
Coming from a heavy MSSQL background two things pleasantly surprised me about
PG's copy, both of which are mentioned rather casually in the article:

1. The CSV data can be streamed over STDIN, read over the driver
connection. This takes the "write a file to a networked filesystem that the
DB server has access to" overhead completely out of the equation.

2. The overhead of bulk insert is shockingly low. In some ad-hoc benchmarks
we did for our use case, we broke even between regular batched
prepared-statement inserts and COPY-based bulk insert at around 10 records,
and by 100 records we were already seeing the same speedup factor the article
demonstrated.
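Point 1 can be sketched with psycopg2's `copy_expert`, which streams a file-like object over the existing connection as `COPY ... FROM STDIN`. This is a minimal sketch under assumptions: the table name `events`, the connection string, and the `rows_to_csv` helper are all illustrative, not from the thread.

```python
import io

def rows_to_csv(rows):
    """Serialize an iterable of tuples into an in-memory CSV buffer
    suitable for COPY ... FROM STDIN. Naive: no quoting of commas."""
    buf = io.StringIO()
    for row in rows:
        buf.write(",".join(str(v) for v in row) + "\n")
    buf.seek(0)
    return buf

def copy_rows(conn, table, rows):
    """Stream rows over the driver connection with COPY FROM STDIN.

    No intermediate file on a filesystem the server can see is needed;
    psycopg2's copy_expert sends the buffer over the open connection."""
    buf = rows_to_csv(rows)
    with conn.cursor() as cur:
        cur.copy_expert(
            "COPY {} FROM STDIN WITH (FORMAT csv)".format(table), buf
        )
    conn.commit()

# Usage (assumes a reachable database with an existing "events" table):
# import psycopg2
# conn = psycopg2.connect("dbname=test")
# copy_rows(conn, "events", [(1, "signup"), (2, "login")])
```

For production data with embedded commas or NULLs, the stdlib `csv` module (or psycopg2's `copy_from` with a tab-delimited buffer) would replace the naive join above.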

------
lokedhs
I would be interested to see the performance difference between raw COPY and
using pgloader: http://pgloader.io

~~~
pritianka
+1 to that.

------
rurban
I bet dropping the indices, inserting, and then reindexing would be even faster.

~~~
craigkerstiens
In this case there weren't any indexes in my tests, but yes, in many cases
bulk inserting without the indexes and creating them afterward can speed
things up. That's really useful when you're doing rollups from raw data and
want to create the indexes after.
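The drop-load-reindex pattern described above can be sketched as an ordered plan of SQL statements run in one transaction, so a failed load leaves the original indexes intact. The table name, index definitions, and the `INSERT ... SELECT` rollup statement are hypothetical examples, not from the thread.

```python
def bulk_load_plan(index_defs, load_stmt):
    """Build the statement order for index-free bulk loading:
    drop the indexes, run the load, then rebuild the indexes.

    index_defs maps index name -> its CREATE INDEX statement."""
    plan = ["DROP INDEX IF EXISTS {}".format(name) for name in index_defs]
    plan.append(load_stmt)
    plan.extend(index_defs.values())
    return plan

def run_plan(conn, plan):
    """Execute the plan in a single transaction (psycopg2 connections
    commit on clean exit from the `with` block, roll back on error)."""
    with conn:
        with conn.cursor() as cur:
            for stmt in plan:
                cur.execute(stmt)

# Usage (assumes a reachable database with "raw" and "rollup" tables):
# import psycopg2
# conn = psycopg2.connect("dbname=test")
# plan = bulk_load_plan(
#     {"rollup_day_idx": "CREATE INDEX rollup_day_idx ON rollup (day)"},
#     "INSERT INTO rollup SELECT day, count(*) FROM raw GROUP BY day",
# )
# run_plan(conn, plan)
```

Building each index once over the finished table avoids the per-row index maintenance that makes indexed bulk loads slow.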

