

Show HN: ‘Parallel psql’, for workflows in PostgreSQL/PostGIS - i_like_postgres
http://github.com/gbb/par_psql/    

======
i_like_postgres
Hi everyone

I’ve written a tool (par_psql) which makes parallelisation easier for
PostgreSQL/PostGIS users, by providing a new piece of syntax.

With --& inline, it runs queries or groups of queries in parallel.

Without --&, it synchronises parallel work, then runs subsequent code normally.

This allows easy control of parallelism and synchronisation inline within your
SQL script.
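To make the run/synchronise rule concrete, here is a minimal shell sketch of the same idea built on background jobs and `wait`. It is not par_psql's actual code: `run_statement` and `run_marked_script` are hypothetical names. It assumes one complete statement per line, the marker as the last characters of the line, and connection settings supplied through the usual PG* environment variables.

```shell
#!/bin/sh
# Hypothetical sketch of the "--&" rule; NOT par_psql's real implementation.
# Assumption: each line of the script is one complete SQL statement.

# Run a single statement through psql (connection comes from PG* env vars).
run_statement() {
    psql -X -q -c "$1" < /dev/null
}

# Read a script line by line:
#   line ends with --&  -> start it as a background job (parallel)
#   any other line      -> wait for running jobs, then run it normally
run_marked_script() {
    while IFS= read -r line || [ -n "$line" ]; do
        case "$line" in
            '')
                continue
                ;;
            *'--&')
                stmt=${line%'--&'}        # strip the trailing marker
                run_statement "$stmt" &   # parallel: do not wait
                ;;
            *)
                wait                      # synchronise earlier parallel work
                run_statement "$line"     # then run this statement serially
                ;;
        esac
    done < "$1"
    wait                                  # let trailing parallel jobs finish
}
```

Calling `run_marked_script workflow.sql` (a hypothetical file name) would start every marked line as a background job and wait for all of them before running the next unmarked line.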

The tool is backwards compatible with existing psql scripts, and par_psql
scripts are backwards compatible with psql. It should work with any version of
PostgreSQL. The only dependencies are bash and psql.

Benchmark and example code is provided at
[http://github.com/gbb/par_psql](http://github.com/gbb/par_psql).

_Quick example_

    
    
        create table a as ...
        create table a1 as ...  --&
        create table a2 as ...  --&
        create table c ...
    

_Some cool uses_

1\. GIS and any other discipline where you prepare diverse source datasets in
a multi-stage workflow before integrating them together.

2\. Where you have CPU-intensive queries, split the work via one field (e.g.
ID) and create parallel temp tables. UNION the results.

3\. Add “preview runs” that complete progressively using subsets of the data,
without delaying the main task.

4\. Create scripts where several tasks run at fixed times after the script
begins (use pg_sleep() and run them in parallel).
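As a rough illustration of use 2, the sketch below generates a script that splits work across N slices by ID and then combines them. Everything named here is hypothetical: the function `emit_split_script`, the tables `big_table`, `part_N` and `result`, the column `geom`, and the function `expensive_fn`. It assumes `id` is a non-negative integer. Plain tables are used rather than TEMP tables because a temporary table is visible only to the session that created it, and this sketch assumes each parallel statement runs in its own psql session. UNION ALL is used instead of UNION because the slices are disjoint.

```shell
#!/bin/sh
# Hypothetical generator for a "split by ID, then combine" script.
# Table, column and function names are placeholders, not from par_psql.

emit_split_script() {
    n=$1

    # One marked statement per slice: these are meant to run in parallel.
    i=0
    while [ "$i" -lt "$n" ]; do
        printf 'CREATE TABLE part_%s AS SELECT id, expensive_fn(geom) AS score FROM big_table WHERE id %% %s = %s; --&\n' "$i" "$n" "$i"
        i=$((i + 1))
    done

    # One unmarked statement: it should only start once every slice is done.
    printf 'CREATE TABLE result AS'
    i=0
    while [ "$i" -lt "$n" ]; do
        if [ "$i" -gt 0 ]; then
            printf ' UNION ALL'
        fi
        printf ' SELECT * FROM part_%s' "$i"
        i=$((i + 1))
    done
    printf ';\n'
}
```

For example, `emit_split_script 4 > split.sql` writes four marked CREATE TABLE statements followed by one unmarked statement that combines them. Because `--&` begins with `--`, plain psql treats it as a comment and would run the same file serially.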

It’s available under the PostgreSQL open source license.

It's a quick hack and version 0.1, so please be kind with any criticism/bug
reports. That said, it works well for me and it's easier than messing around
with GNU Parallel/bash/crontab/SQL combinations.

Enjoy! :-)

Graeme

