
PostgreSQL Magic - websec
http://goto.project-a.com/postgresql-magic/
======
timonv
As a fellow Postgres amateur wizard, I love the positive attention that Postgres
seems to be getting more and more, while some half-baked databases get less and
less (unless you actually need map-reduce, of course. You probably don't.
/trollface)

What I miss though in this article, and where I think postgres shines majorly
compared to other rel dbs, are window functions.

They let you apply a partition to a set. You can do some great wizardly
magic with this, like 'give me each row matching this and that, which matches
the last occurrence of a given column'.
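A sketch of that "last occurrence" pattern, assuming a hypothetical events table (table and column names are made up):

    
    
      -- events(user_id, kind, created_at): per user, keep only the row
      -- for the most recent occurrence of each event kind.
      SELECT user_id, kind, created_at
      FROM (
          SELECT user_id, kind, created_at,
                 row_number() OVER (PARTITION BY user_id, kind
                                    ORDER BY created_at DESC) AS rn
          FROM events
      ) t
      WHERE rn = 1;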

Edit: WITH clauses (CTEs) are great for avoiding a lot of nesting with
subqueries and/or for reusing subqueries throughout the main query. They have
added functionality for recursion, but I suppose that unless you do some kind
of tree traversal on big data sets, the benefits of that are so-so, readability
and all that.
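For what it's worth, the recursive form is short; this sketch assumes a made-up categories(id, parent_id, name) table and walks the subtree under id 1:

    
    
      WITH RECURSIVE subtree AS (
          SELECT id, parent_id, name
          FROM categories
          WHERE id = 1              -- the root we start from
        UNION ALL
          SELECT c.id, c.parent_id, c.name
          FROM categories c
          JOIN subtree s ON c.parent_id = s.id
      )
      SELECT * FROM subtree;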

Edit2: I had to double-check this, since I never use custom types: using
UNNEST(ARRAY[...]) on a custom type is superfluous. Just use ROW().
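Concretely, assuming a composite type already exists, ROW() builds a value of it directly, with no ARRAY/UNNEST detour:

    
    
      -- assuming: CREATE TYPE point3d AS (x float8, y float8, z float8);
      SELECT ROW(1.0, 2.0, 3.0)::point3d;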

~~~
greggyb
Window functions are not a Postgres thing, but part of the SQL standard with
(varying, of course) support across most of the major RDBMS's.

I've seen comments like this several times that seem to imply that Postgres is
exceptional either due to having window functions, or like in this one where
the tone sounds as if Postgres does them particularly better.

I'm curious, as I do almost 100% of my production work in MS SQL Server, where
I am able to do everything that comes up when "Postgres window functions" are
mentioned, whether there is extra functionality in Postgres compared to other
RDBMS's window function implementations?

Is there a reason that Postgres seems to get special attention for window
functions?

Thanks.

~~~
asolove
I think it's largely a question of community.

Many users of MS SQL or Oracle learned advanced SQL or platform-specific
features which they take for granted. They work at companies that spend lots
of money for DBs and hire people who use them very well.

Postgres also has a core community of very knowledgeable folks. But most of
its user base, and especially those new to using it, would otherwise use
MySQL. They only know small bits of SQL, have only a basic understanding of
why any company would need a DBA, and only learn things like this when they
absolutely need them for a task. And, for that group of people, MS and Oracle
probably aren't serious choices, so the fact that a free database has these
cool features seems exciting to them.

(Note that I don't say this with disdain. I'm a former MySQL user who now uses
postgres on Heroku and am constantly learning things like this. I wouldn't
even have understood your perspective until I started working with people who
knew Oracle and MS SQL so thoroughly.)

~~~
danmaz74
> so the fact that a free database has these cool features seems exciting to
> them

It IS exciting to them. And to anybody who can't afford Oracle or other
expensive enterprise solutions.

------
rosser
Quibble: the "now()" function doesn't return the time of _statement_ start; it
returns the time of _transaction_ start.

    
    
      $ psql -q
      rosser=# begin;
      rosser=# select now();
                    now              
      -------------------------------
       2015-09-11 02:28:54.262142-07
      (1 row)
      
      rosser=# select now();
                    now              
      -------------------------------
       2015-09-11 02:28:54.262142-07
      (1 row)

~~~
cleansy
That's also not completely accurate:

    
    
      postgres=> SELECT now(), now(), clock_timestamp(), clock_timestamp();
                    now              |              now              |        clock_timestamp        |       clock_timestamp
      -------------------------------+-------------------------------+-------------------------------+-------------------------------
       2015-09-11 09:57:00.414422+00 | 2015-09-11 09:57:00.414422+00 | 2015-09-11 09:57:00.419087+00 | 2015-09-11 09:57:00.41909+00
      (1 row)
    

Now() stays the same for the entire statement as well. clock_timestamp()
doesn't.
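There's actually a third granularity, statement_timestamp(), sitting between the two: now() is frozen at transaction start, statement_timestamp() at the start of each statement, and clock_timestamp() reads the wall clock. Inside one transaction:

    
    
      BEGIN;
      SELECT now(),                 -- frozen at transaction start
             statement_timestamp(), -- frozen at start of each statement
             clock_timestamp();     -- advances even within a statement
      COMMIT;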

~~~
fgutmann
If you ever manage to have multiple transactions in one statement please email
me.

~~~
AlterEgo20
Easy. Just use dblink or a foreign table to execute part of the statement
inside another transaction. dblink to the same server/db is often used as a
way to create an "AUTONOMOUS_TRANSACTION".

------
whistlerbrk
So many amazing features that are incredibly relevant to modern web
development. I've switched over a personal project from MySQL to PG recently
and this time I'm not looking back.

On 9.4 I have the HSTORE and JSONB types as well as range types which are
incredibly useful. If you have a solid language library wrapper for PG you can
spend far less time mangling data from one format to the next and just get to
work. I love it.
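A quick sketch of what that looks like in practice with the 9.4-era types mentioned above (table and column names made up):

    
    
      CREATE TABLE events (
          id      serial PRIMARY KEY,
          payload jsonb,
          during  tstzrange
      );
      
      INSERT INTO events (payload, during)
      VALUES ('{"kind": "deploy", "ok": true}',
              tstzrange('2015-09-11 09:00+00', '2015-09-11 09:05+00'));
      
      -- query inside the JSON and test range containment, all in SQL
      SELECT * FROM events
      WHERE payload->>'kind' = 'deploy'
        AND during @> '2015-09-11 09:01+00'::timestamptz;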

------
zrail
I have seen magic done with custom types and aggregates. For example, I know
of projects with run-length-encoded bitsets and custom aggregates for doing
set operations and counts, sort of like Redis' bitset commands but built into
PG. This is a massive space optimization, because otherwise you'd just have a
join table with zillions of tiny rows.
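A toy version of the aggregate mechanics (the run-length-encoded bitsets described above would also need a custom type; this just ORs plain bit strings across rows):

    
    
      CREATE FUNCTION bitset_or(a bit varying, b bit varying)
          RETURNS bit varying
          AS $$ SELECT a | b $$
          LANGUAGE sql STRICT IMMUTABLE;
      
      CREATE AGGREGATE bitset_or_agg (bit varying) (
          sfunc = bitset_or,
          stype = bit varying
      );
      
      SELECT bitset_or_agg(flags)
      FROM (VALUES (B'0011'::varbit), (B'0101'::varbit)) AS t(flags);
      -- yields 0111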

------
mhd
I have to admit that I'm still not quite sure about arrays in relational
databases. Don't get me wrong, I use them all the time, but it kinda feels
like when you've got tables that you just _know_ could be normalized more
thoroughly.

Also: If someone has a good version-control wrapper for stored procedures,
that would be swell. And while I'm doing the wishful thinking shtick, maybe a
Coffeescript-like preprocessor and a good linter?

~~~
ahachete
> I have to admit that I'm still not quite sure about arrays in relational
> databases. Don't get me wrong, I use them all the time, but it kinda feels
> like when you've got tables that you just know could be normalized more
> thoroughly.

Why is that? There are many use cases for data (like vector data) that really
need an array, and it would be unwise to store it as columns. Think of, for
example, matrix data, or a practically unbounded number of double values
coming from a sensor. Plus, PostgreSQL has a limit on the number of columns in
a table (1600), which you could quickly run out of if you represented this
kind of data as regular columns rather than array values.
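For instance (a made-up schema), one row per sensor per window with the readings packed in an array, instead of hundreds of columns or one row per sample:

    
    
      CREATE TABLE sensor_windows (
          sensor_id    int,
          window_start timestamptz,
          readings     double precision[]
      );
      
      INSERT INTO sensor_windows VALUES
          (1, '2015-09-11 00:00+00', ARRAY[0.1, 0.4, 0.35, 0.2]);
      
      -- direct access by position (arrays are 1-indexed in Postgres)
      SELECT readings[3] FROM sensor_windows WHERE sensor_id = 1;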

~~~
hobs
Why wouldn't an unbounded set of values related to a sensor just fit in a
one-to-many data model? (Instead of a constantly updating row, simply
inserting more rows.)

It seems like you would be trading one fat row for many small rows, which is
something the database does well (unless you always want all of the sensor
data every time).

~~~
ahachete
An array is basically a set of homogeneous fields indexed by number, which are
probably accessed in constant time. A relational table is a completely
different beast. Even disregarding performance (where the difference can be
abysmal in many use cases), it's not obvious how to access by index
without "hacks". And if you consider multidimensional data, things get tougher
without arrays...
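The multidimensional case is where this is most obvious; with an array it's a one-liner (toy matrix example):

    
    
      -- element at row 2, column 3 of a 2x3 matrix
      SELECT (ARRAY[[1,2,3],[4,5,6]])[2][3];  -- yields 6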

------
sigma2015
What tool do you use for visualizing table relationships (foreign key
constraints f.x.) in PostgreSQL?

~~~
claysmithr
[http://www.pgadmin.org/](http://www.pgadmin.org/) ?? but I am also new to
postgresql. There are also some commercial tools available.

~~~
sigma2015
how?

------
dsugarman
The best part of this for me was the batch updates. I am a little confused
about the need for a user-defined type that is a copy of the table's schema.
Wouldn't it be better to use:

::table_name%ROWTYPE;

That way you wouldn't need to maintain a table and a type with the exact same
schema, no?
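In PL/pgSQL that would look something like this sketch (function and table names made up); the %ROWTYPE variable tracks the table's schema automatically, so there's no separate type to maintain:

    
    
      CREATE OR REPLACE FUNCTION touch_user(uid int) RETURNS void AS $$
      DECLARE
          u users%ROWTYPE;   -- assumes a "users" table exists
      BEGIN
          SELECT * INTO u FROM users WHERE id = uid;
          RAISE NOTICE 'found %', u.name;
      END;
      $$ LANGUAGE plpgsql;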

