Hacker News new | past | comments | ask | show | jobs | submit login
Postgres 11 Beta 1 released (postgresql.org)
203 points by hardwaresofton on May 24, 2018 | hide | past | favorite | 68 comments



> PostgreSQL 11 introduces SQL stored procedures that allow users to use embedded transactions (i.e. BEGIN, COMMIT/ROLLBACK) within a procedure. Procedures can be created using the CREATE PROCEDURE command and executed using the CALL command.

I'm pretty sure this is going to be my favorite feature in v11 as it will facilitate conditional DDL for the few remaining things at aren't already allowed in transactions, in particular add values to enums.

Prior to v11 you could create new enums in a transaction but could not add values to an existing one. IIUC how this v11 feature works, you could now have a stored procedure check the current value list of the enum and decide to issue an ALTER TYPE ... ADD VALUE ... at runtime, optionally discarding any concurrency error. All of that could run from a SQL command, no external program needed.

> Prior to PostgreSQL 11, one such feature was using the ALTER TABLE .. ADD COLUMN command where the newly created column had a DEFAULT value that was not NULL. Prior to PostgreSQL 11, when executing aforementioned statement, PostgreSQL would rewrite the whole table, which on larger tables in active systems could cause a cascade of problems. PostgreSQL 11 removes the need to rewrite the table in most cases, and as such running ALTER TABLE .. ADD COLUMN .. DEFAULT .. will execute extremely quickly.

... and this is a close second favorite. Having NOT NULL constraints on columns makes schemas much more pleasant to work with, but for schema migrations that means requiring DEFAULT clause to populate existing data. While there are ways to slowly migrate to a NOT NULL / DEFAULT config for a new column (e.g. add column without constraint, migrate data piecemeal, likely in batches of N records, then enable constraint), having it for free in core without rewriting the table at all is simply awesome.


Feature author here. Running not-allowed-in-transaction-block DDL, such as VACUUM, still won't work in stored procedures. Room for future improvement.


I use DO for this sort of thing.


You also cannot run VACUUM inside a DO block. It's the same thing underneath.


No, but I use it for the sort of DDL that's missing IF [NOT] EXISTS syntax.


> ... and this is a close second favorite. Having NOT NULL constraints on columns makes schemas much more pleasant to work with, but for schema migrations that means requiring DEFAULT clause to populate existing data. While there are ways to slowly migrate to a NOT NULL / DEFAULT config for a new column (e.g. add column without constraint, migrate data piecemeal, likely in batches of N records, then enable constraint), having it for free in core without rewriting the table at all is simply awesome.

I'm not 100% sure but I don't think that's what it says. If you create a nullable column with a non-null default it won't rewrite the whole table anymore. You could get this behaviour in PostgreSQL 10 by creating the column and setting its default in separate DDL statements (all in a single transaction).


> You could get this behaviour in PostgreSQL 10 by creating the column and setting its default in separate DDL statements (all in a single transaction).

But that'd mean that the existing columns wouldn't have the DEFAULT value. And thus manually would have to update the whole table. Whereas the facilities in 11 set it on all columns without rewriting the whole table.


> We have heard your frustrations and have now added the ability to quit the command-line using the keywords quit and exit and hope that quitting a PostgreSQL session is now as enjoyable as using PostgreSQL.

o.O

I always tought the right way is CTRL+D.


Surprisingly highest voted question on stackoverflow about postgres is exactly "how to quit?" https://stackoverflow.com/questions/tagged/postgresql?sort=v...


They should have added !wq as a way to exit also.


C-x C-c is much better !


Not too fr off if you, like me, prefer writing your SQL in sql-mode and have a comint buffer with psql you send your queries to!


Or !x for maximum efficiency ;-)


What else uses that?


Vi / vim.


You probably meant `:wq!`.


:D


Hope Mariadb gets a page out of their book and stops Ctrl-C from exiting their command-line tool.


I can not wait for this to happen! Can be frustrating when you accidentally exit the client. A few months ago someone took a stab at it but its not been merged yet.

https://github.com/MariaDB/server/pull/703

https://lists.launchpad.net/maria-developers/msg11217.html

https://jira.mariadb.org/browse/MDEV-14448


Also \q


CTRL + D is news to me, i’m a \q guy too.


Ctrl+D is more generic, as it sends the signal to close the pipe (i.e. stdin from the perspective of psql).


It's a standard quit command for shells in UNIX – e.g. if you hit ^D in a Bourne shell it will exit as well.


It's not really a "command", though. Unlike, say, Ctrl+C, which becomes a SIGINT signal that the program receives and gets to decide how to handle, Ctrl+D never reaches the program as a separate recognizable thing that you can write a handler for.

Instead, Ctrl+D is the way to ask the TTY driver to pass the program an EOF instead of input in response to its next read(2) call to its stdin. It's the same signal that the program would receive if you had been piping input into the program from a file, and that file reached the end. Thus, any CLI program built to handle being piped into, will "automatically" support Ctrl+D as a way to quit out of it.


This is how you quit.


This also works in Mysql.


I would argue CTRL+D is the right way since it works in almost all shells, and I am surprised not more people know it.


Ctrl+D has a learning curve because its default meaning (i.e. when you're not messing with termios) is quite surprising. From `man tcsetattr`:

> VEOF (004, EOT, Ctrl-D): End-of-file character (EOF). More precisely: this character causes the pending tty buffer to be sent to the waiting user program without waiting for end-of-line. If it is the first character of the line, the read(2) in the user program returns 0, which signifies end-of-file. Recognized when ICANON is set, and then not passed as input.

In other words, most programs that exit upon Ctrl+D never see the Ctrl+D, they see an empty result from read() on stdin and interpret that as end-of-file. However, that only works when the tty buffer is empty. Try this: On a Unix shell, run `cat`, type something without pressing Enter, then press Ctrl-D instead.


Every year I get to teach a handful of people about CTRL-Z. (CTRL-U is my favorite teletype command). We hardly ever get to CTRL-D.


You do know that doesn't actually terminate anything, right?


What, CTRL-Z? That’s the point.

A couple minutes into a ten minute task someone goes whups I should have backgrounded that and they CTRL-C, up arrow, & and enter, and I say something like, “dude... we need to talk about CTRL-Z bg”


If you want to try it without running an installer on macOS, you can try the latest build of Postgres.app: https://github.com/PostgresApp/PostgresApp/releases/tag/v2.2...

(this build includes all the latest supported versions of PostgreSQL and the Beta of 11)

It doesn’t include the JIT stuff yet, because that needs a lot of extra things (it needs the LLVM runtime library). I still need to look into how to best package that, and I’m not sure all the extra size is worth it. I’m excited to hear all about it next week at pgcon!


This is super cool. Anything like that for Linux and/or Windows?


Woah, the JIT for queries sounds really interesting! I'm curious what that will do for performance, especially for linear scans of a large number of rows, which I imagine is where you would see the most improvement. I wonder if there are any other databases that do this, and what the advantages/disadvantages are? I'm fairly certain AWS Redshift does this also.


> Woah, the JIT for queries sounds really interesting!

Thanks (author here ;)). Note it's currently not the actual queries themselves that get JIT compiled (started to work on that for v12), but evaluation of expressions inside queries. WHERE clauses, aggregates, projections (SELECT ... lists), grouping conditions are now all handled through expression evaluation and therefore can be JITed. The control flow between different type of executor nodes not yet however.

Additionally tuple deforming (i.e. converting the on-disk representation into a more easy to use / faster to access in-memory representation) is also JITed. But currently only when done from within an expression, but that's mostly the case.

> especially for linear scans of a large number of rows, which I imagine is where you would see the most improvement.

It's basically a benefit whenever expressions get invoked on a large number of rows. That can be aggregates over a large sequential scan, but it can be beneficial for a good chunk of other types of queries too.

> I wonder if there are any other databases that do this, and what the advantages/disadvantages are? I'm fairly certain AWS Redshift does this also.

Yes, there's several other databases that do this too. The disadvantages are largely that it adds processing overhead / increases latency till the query can be actually processed. Which means that if the, currently pretty simplistic, logic to guess whether it's worth to JIT is wrong, you make the query slower due to the added effort to JIT the expressions.

You can hide a lot of that by doing JIT compilation in parallel and doing smart caching, but that's currently not done yet.


Compiling to either a P-code, or to native code, for query execution is quite an old technique. It is generally only a win under certain conditions, like most optimizations.


Now is the time to test this. From my early experiments, you can expect to get speedups for queries that last longer than a few seconds. A lot depends on whether I/O, caching, etc. dominates. The only disadvantage is that it takes time to do the JIT compilation, so if the query is already fast, JIT compilation will probably make it slower. Hence, there are ways to adjust when it should be used.


SQLite3 compiles to bytecode, and would be a very good candidate for JITting the bytecode.


I’d assume that most queries are limited by memory / disk access time, so for many queries the savings will be marginal. But I imagine that it could make complex aggregates a lot faster?


True, I guess you'd see the advantage most when there is a complex WHERE clause or some complex projections and aggregates. Also, looking here https://www.pgcon.org/2017/schedule/attachments/467_PGCon%20... it seems they can JIT the process for creating an index, which has some modest improvements. I imagine there are plenty of things like that that could also benefit from compilation.


> t seems they can JIT the process for creating an index, which has some modest improvements

We could, but currently don't yet. Not enough time / too short days...


> I’d assume that most queries are limited by memory / disk access time

That's not true for all that many analytics statements actually. Good storage systems can deliver data quite fast, and in many cases the buffer cache can hide a lot of the access latency too.

There's obviously issues with memory access latencies, but that's actually something that JIT compilation can hide with, because it increases the amount of work that can be done in the out-of-order window. Without JIT there's just too many instructions for that to hide all that memory access latencies.



>> UPDATE statements issued to a partition key now move affected rows to the appropriate partitions

I wonder who's the author of this feature. I must buy him/her a beber/coffe/whatever


The release notes say Amit Khandekar. (I don't know him.)


> "PostgreSQL 11 introduces SQL stored procedures that allow users to use embedded transactions (i.e. BEGIN, COMMIT/ROLLBACK) within a procedure."

Did PostgreSQL not have stored procs before?


Stored functions, not procedures. The difference is explained here: https://www.postgresql.org/docs/11/static/xproc.html Basically, procedures do not return values, and can have nested transactions. (Today, functions are used to accomplish much of what seems more appropriate for a procedure.)


They do return values:

https://www.postgresql.org/docs/devel/static/sql-call.html

    CALL executes a procedure.
    
    If the procedure has output arguments, then a result row will be returned.
and from https://www.postgresql.org/docs/devel/static/sql-createproce...

    argmode

        The mode of an argument: IN, INOUT, or VARIADIC. If omitted, the default is IN. (OUT arguments are currently not supported for procedures. Use INOUT instead.)


Not in the sense of a function; you can't call a procedure from within an expression. Interesting that they can produce output rows though.


Yes, that's true.


I think the new part is "allow users to use embedded transactions (i.e. BEGIN, COMMIT/ROLLBACK) within a procedure."


Stored procedures are technically new with 11, but I think the only functional (rather than syntactic) difference between the new support for stored procs and stored functions which happen to not return anything is the transaction control for stored procs. (And stored functions returning nothing are often referred to informally as stored procedures in postgres discussions.)


Stored procedures can return values. See https://www.postgresql.org/docs/devel/static/sql-call.html


That's a rather nice feature that Transact-SQL (both Sybase and Microsoft) has had for a long time. It should make it easier for some folks to convert over.


Just as a point of clarity on that, SQL Server only allows one encapsulating transaction per session. You can have begins and so on within a stored procedure, but really that is little more than a "just in case" defense (e.g. just in case the caller didn't already do this), as you can only commit or rollback the outer transaction.

If you want to have atomic transactions within a transaction you have to do hacks like make a distributed call back to yourself, in that new session creating another transaction.

It's an element of SQL Server that confuses people to this day.


I was more on the Sybase side and dealt with an environment where the calling program didn't initiate a transaction so it worked quite well for me. We were kind of touchy about having a transaction open longer than it needed to be.


Absolutely! I come from a historically Microsoft shop, and the lack of this was by far the biggest barrier when initially learning PostgresSQL since I was still in the mindset of designing stuff as I would in Sql Server.


I'm probably wrong, but I think Stored Procedures are being introduced in 11.

From the announcement: "SQL Stored Procedures

PostgreSQL 11 introduces SQL stored procedures that allow users to use embedded transactions (i.e. BEGIN, COMMIT/ROLLBACK) within a procedure. Procedures can be created using the CREATE PROCEDURE command and executed using the CALL command."

It doesn't make sense to teach people how to create procedures if they already have them before. The docs[0] also don't have a section for CREATE PROCEDURE before 11[1].

It is a great tool anyway, better late than never.

[0] https://www.postgresql.org/docs/10/static/sql-commands.html

[1] https://www.postgresql.org/docs/11/static/sql-commands.html


You're correct, "CREATE PROCEDURE" didn't exist until PG 11. But "CREATE FUNCTION" has existed for ages, and it covers most of the common cases for server-side logic.


Yes it did. The "embedded transactions" is the new feature.


No, it didn't.

It had functions, not procedure. You can emulate (most of) procedure using function that return void.

See also https://blog.2ndquadrant.com/postgresql-11-server-side-proce...


Yes, that's true, although slightly pedantic. I'm sure there are procedural cases that PG functions can't emulate, although I don't think I've ever encountered one in practice.

Having said that, the addition of transaction support is a great step forward.


Only one: the ability to BEGIN/ROLLBACK/COMMIT in the body of the function.


And, this is just adding new syntactic support for SQL standards compliance, right? I don't think it's really a new control-flow if you ignore syntactic differences.

I think you can get the equivalent behavior out of BEGIN/EXCEPT blocks in plpgsql functions. I think this behaves as if you used SAVEPOINT and ROLLBACK TO SAVEPOINT which have been directly supported in postgresql but not in functions.


> most of

Curious, what are those procedures which can't be emulated as functions. Maybe you have a chance to describe some example?


There is only one thing that procedures can do that functions can't: nest transactions inside the transaction that calls it.


Super excited about innovation in RDBMSs. ACID and relational models are (in my opinion) highly underrated, but need more modern syntax & flexibility. What I really want is Prolog-like, deductive features in a traditional RDBMS, but that may have to wait.


It's very exciting to see table partitioning native support. I can't wait until foreign keys point to a partitioned table are supported with cascading. This single feature makes many database designs unprecedentedly salable especially for situations that are based on time (ie sales this quarter are all we care about and must be fast).




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: