
Using PostgreSQL Arrays The Right Way - drob
http://blog.heapanalytics.com/dont-iterate-over-a-postgres-array-with-a-loop/
======
adamors
While this is really cool, can anyone tell me if it has become any easier to
write stored procedures/functions?

Last time I tried writing stored procedures I went into a deep depression from
the lack of a proper IDE/editor/debugger/tester/anything.

Is there any process out there besides time consuming trial and error? Or does
everyone who writes these knows them by heart and I should just stick to
programming languages?

~~~
beliu
Has anyone tried out python in postgres?
([http://www.postgresql.org/docs/9.0/static/plpython.html](http://www.postgresql.org/docs/9.0/static/plpython.html)).
In the past, I've tried to avoid stored procs for the reasons mentioned above,
but have not yet had the chance to try out python and would be interested in
hearing about people's experiences with it.

~~~
asnyder
Pretty good, you can even access local python libraries for direct use in your
db functions. Only negative is that you have to re-compile your functions to
update the library references.

------
dsugarman
we heavily rely on postgres as well, in the past we have tried to do the
entire backend on postgres as we believe it could be the best solution, but we
have had a terrifically hard time of maintaining stored procs, etc. We still
use them, but a lot less than we probably would and wound up building an ORM
and moving most logic to PHP.

I would be really interested in how you maintain your database code if you are
up for another write up or have some time to chat.

~~~
drob
For now, the following works for us:

\- For plpgsql functions that are required by app queries, we just roll them
out when we write them / when they change (via ansible).

\- For plpgsql functions used in jobs, the job just reloads the relevant
plpgsql functions on the relevant DBs before they start doing anything. (It's
a little wasteful, but not in a way that matters for now.)

\- The UDFs we've written in C don't change too often, but we deploy them
manually when they do.

What are some of the headaches you've had in managing stored procs? How often
is your app code changing / requiring new ones?

~~~
erichurkman
One headache I had in the past was with a highly-available app with stored
procs. It was rare to do parallel app server restarts, they were instead done
in serial. (This did make certain migrations a pain.) Outdated app servers
would continue calling pgpgsql functions in the outdated way until they
received new code.

We solved it with a small wrapper around the calls to plpgsql functions — we
set up a build process to generate unique names that were called by version.
During deploy, we'd have two or more versions of the 'same' function running
in parallel. The last deploy step dropped the now-outdated functions.

It worked relatively well.

~~~
jeltz
Have you thought about using schemas and the search path for this? That is
what Zalando does to deploy stored procedures without risking any downtime.

~~~
erichurkman
If I had to do that again, I would. It's been a while, but as I recall from
that project the database adapter had shoddy support for search paths.

------
waffle_ss
Why use an array instead of a separate table? Are you trying to avoid a JOIN?

~~~
drob
Yes. Our schema is fully denormalized, which is particularly important for
performance on funnels. (See heapanalytics.com/features/funnels)

In particular, to compute where a user drops off in a funnel, I need to scan
one array from left to right, and I don't need to do any joins. This shards
very well, since all of a user's data lives on one shard, and most of the
queries are aggregations, which are simple to reassemble from subqueries.

~~~
ams6110
So why are you using an RDBMS?

~~~
_pmf_
> So why are you using an RDBMS?

Because even a shitty RDMBS is more robust, more secure and faster than any
NoSQL wankery for this kind of use case.

------
thyrsus
Does anyone know offhand whether fixed length values, e.g., bigint, have this
problem?

~~~
drob
No, this isn't an issue for fixed length types, although you're still better
off using unnest instead of explicitly indexing into them.

