Hacker News new | past | comments | ask | show | jobs | submit login
Exploring PL/pgSQL: Strings, arrays, recursion, and parsing JSON (eatonphil.com)
69 points by stevekemp on Oct 25, 2021 | hide | past | favorite | 21 comments



I've written a bit of PL/pgSQL, but I'll be honest, the basis of most of it has been copy/paste from Stackoverflow and tweak a couple of things here and there.

More recently I learned that you can also write python functions 'natively' (ie no need to install any modules) in postgres, which honestly seems like a much better idea all round than learning PL/pgSQL, even if you don't know any python already.

Can anyone comment on if there are any feature differences between the two, or why that might be wrong?


PL/pgSQL is a first class language with first class semantics. As @sureglymop mentioned, you're probably going to get better performance than an interrupted or compiled language.

Another consideration is portability. Perhaps you're migrating to or from another database that has its own Procedural Language. You're probably going to have an easier time dealing with PL -> PL when migrating stored procedures.

There are times when it's just easier and more intuitive to glue a bunch of SQL together using a PL, than whipping up some python/js/etc...

Also keep in mind that depending on the DB, you may get a scripting language like python or js, but you can't use any third party libraries.


I find PL/pgSQL preferable if only because it fits most naturally in the database environment. Queries fit very naturally into the code with little special treatment; being able to rely on the full power of PostgreSQL's type system in my logic without any additional accommodations; exception/error handling feeling more natural and integrated;and as others have noted, certain performance advantages (usually).

PL/pgSQL has a fairly low complexity in my experience compared to many other languages... More often it's the SQL/database conceptual part that I see people struggling with most. PL/pgSQL in some ways makes that a little more front and center... But if you're writing functions and procedures you're really not going to escape having to come to terms with the database. May not be the issue you were experiencing, but expect that to be more common.


Well, PL/pgSQL is compiled but there is no semantics checking at compile time. So while you may have better performance depending on use case etc. it can be a bit hard to work with.

I wrote a lot of Oracle PL/SQL at my last job* and I will say if you're not familiar with that kind of database language I would highly recommend Python over it. First think about whether you really want to have business logic in your database and then choose what you know best.

*: It was messy and I am glad that that job is over.


To my knowledge, AWS RDS Postgres does not support PL/Python, if that's your database.


Ah that's interesting to know, thanks. Part of the premise here was it being available everywhere including the managed services. If that's not true, or it's not trivially installable, it removes much of the advantage I think.


What's the general consensus of what logic should remain in the DB? There are a few places I write SQL and call it from a host language and think "this feels like it should all be in SQL, but it'd be much more verbose". Just interested in what anyone here thinks.


I've stuck with thinking of Pl/* as an aid to data normalization. This way i can look up records like country codes in the database but keep business rules in the app.


I use plpgsql for constraints but try to avoid any form of mutation mostly based on other folks horror stories.


The idea of defining and executing a recursive function inside my production database seems a little scary. Might want to build in your own recursion limits to that one. Is there a way to set execution timeouts or something?


There's a parameter that sets a memory consumption limit on potentially recursive functions: max_stack_depth.

(https://www.postgresql.org/docs/14/runtime-config-resource.h...)


How do people deploy this stuff declaratively, especially cleaning up old functions? I'm still surprised something terraform-esque doesn't exist.

Bunch of SQL scripts with "CREATE OR REPLACE" and cross your fingers? Make a schema that a script can wholly own, then blow the whole thing away and reinitialize in a transaction?


I don't say this from experience but DDLs in postgres are transactional so I don't understand the problem with wrapping create or replaces in a transaction with no need to cross your fingers.


The transaction behavior is fine, it's more that I've had people delete function declarations from scripts while never cleaning them up in the data warehouse and then you end up with old functions kicking around in prod. My use case is mostly warehousing, not application stuff, so I don't care about full blown migrations. Just "these are the functions that we have declared for use, no more no less".

It seems like with DBT we're getting to declarative table construction, but I'm looking forward to declarative maintenance of other data warehouse aspects.


You can use database schema compare tools for this. You create a clean empty database using the new DDL and use the tool to compare that schema with the schema for the version in test or prod. The tool will then show the differences and can create a sql script to update the database in test or prod.

I've used this a lot with SQL Server and used the Redgate tools for that and that worked very well. For PostgreSQL, it seems there's a free tool from devart: dbForge Schema Compare [0]. I've never used it and have no idea how well it works...

[0]: https://www.devart.com/dbforge/postgresql/schemacompare/down...


> it's more that I've had people delete function declarations from scripts while never cleaning them up in the data warehouse

Ah that makes sense. I guess you could hack around that by having a script reading all your declarations that are checked into code and removing any that exist in the db that aren't in that list, or something.


I personally use Sqitch to manage database migrations. Makes it very easy to set up test databases for all development work.


I use this https://github.com/golang-migrate/migrate in a deploy step to each environment with the upgrade / downgrade scripts committed to the repo alongside the code. The scripts can do pretty much anything you need to do in PG including defining and executing functions.


I am completely new to PL/pgSQL, so forgive me for the BASIC question, but (out of curiosity) for string manipulation is PL/pgSQL too slow compared to modern LUA/Python? The function that I have in mind will get a TEXT and return another TEXT with the PHONETIC representation of this TEXT.


The sort of algorithm you have in mind is usually called a soundex (“sound-index”). Fairly confident order of performance would be PL/pgSQL > Lua > Python.


Thank you very much! I really appreciated it.




Consider applying for YC's W25 batch! Applications are open till Nov 12.

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: