Show HN: Postgres.js – Fastest Full-Featured PostgreSQL Client for Node and Deno (github.com/porsager)
257 points by porsager on March 24, 2022 | 83 comments



Hi everyone. A bit more than two years ago I released the first version of Postgres.js, a fully featured PostgreSQL driver for Node.js written as a learning experience, out of curiosity and annoyance with the options available at the time. It greatly outperformed the alternatives[1] by using pipelining and prepared statements, while providing a much better development experience that's safe from SQL injection. Since then I've been busy building things with it, now running in production, and although quite delayed, I'm happy to release a new major version today with some really exciting new features:

1. Realtime subscribe to changes through Logical Replication [2]

It's now possible to use logical replication to subscribe in realtime to any changes in your database with a simple API like `sql.subscribe('insert:events', row => ...)`. Inspired by Supabase Realtime, you can now have it yourself in Node.
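A minimal sketch of what that looks like (the connection options and the events table are just examples here):

  const sql = postgres({ /* connection options */ })

  sql.subscribe('insert:events', row => {
    console.log('new row in events:', row)
  })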

2. A Safe Dynamic Query Builder

Nesting the sql`` tagged template literal function allows building highly dynamic queries while staying safe and using parameterized queries at the same time.
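For example, a fragment can be included conditionally while everything stays parameterized (a sketch; the table and filter are made up):

  const filterAge = minAge =>
    minAge ? sql`and age > ${ minAge }` : sql``

  const users = await sql`
    select name, age from users
    where name is not null
    ${ filterAge(25) }
  `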

3. Multi-host connection URLs for High Availability support

It's really nice to be able to quickly spin up a High Availability Postgres setup using pg_auto_failover[3] and connect using Postgres.js with automatic failover and almost 0 downtime.
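Multiple hosts can simply be listed in the connection URL (hostnames here are placeholders):

  const sql = postgres('postgres://user:pass@host1:5432,host2:5432/mydb')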

4. Deno support

It also works with Deno now, passing all tests except a few SSL-specific ones, which require fixes in Deno.

5. And much more

Large object support, efficient connection handling for large-scale use, cancellation of requests, TypeScript support, async cursors.

[1] https://github.com/porsager/postgres-benchmarks#results

[2] https://github.com/porsager/postgres#realtime-subscribe

[3] https://github.com/citusdata/pg_auto_failover


Congrats on the release - as I mentioned on twitter, your templating design is the best DX I've seen for any PG client.

> Inspired from Supabase Realtime you can now have it yourself in Node.

Very cool! We have some Node servers internally, I'll see if the team want to switch.


Thanks - that's really nice to hear!

That would be really interesting. Are you using `pg` currently, and are you using WebSocket connections in Node to Elixir for realtime, or not using realtime in Node?


yes, we use `pg` in all our Node servers (and no realtime)


Alright. I've got a pg wrapper lying around that can help with migration ;) I've DM'ed you on twitter, so let me know if that could be useful.


I hit a small type issue migrating from v2 to v3 that's not listed in the migration guide: https://github.com/porsager/postgres/issues/283


One question - is there a way to explicitly start the connection, instead of connecting automatically at initialization?


Yeah, just make a simple query :)

  await sql`select 1`


well.. my bad, somehow I didn't think about that..

anyway this lib looks really cool! definitely will use that instead of hand crafted tagged template functions passed to `pg`.


Super cool, looks great. Interesting to see you built it with Typescript.

I was wondering-- and I know this is a long story, but I am curious:

How does one go about building a PostgreSQL driver for NodeJS in Typescript?

How do you design and begin that sort of thing? What sort of knowledge resources did you rely on (i.e. documentation, other references)?

I googled 'database driver' to get a definition: "A database driver is a computer program that implements a protocol (ODBC or JDBC) for a database connection. The driver works like an adaptor which connects a generic interface to a specific database vendor implementation."


Thanks a lot.

I did not build it with TypeScript, just plain JS - I don't use TypeScript at all (not my cup of tea), but a kind soul (thanks @minigugus) contributed the typings ;)

I actually began it out of curiosity and because I was missing features in the current options. I had been using a wrapper around pg-promise to give me the API I wanted for quite a while, and one day when going through the PostgreSQL documentation I landed on the protocol (https://www.postgresql.org/docs/current/protocol.html) pages. Now the PostgreSQL documentation might seem daunting at first, but once you figure out the structure and semantics it's such an insanely good resource. I had the first POC version working over a weekend, and from there it was about 80% of the way to supporting all the things I needed. The Postgres protocol is actually fairly simple, and I suppose that's why other databases have chosen to use it as well (e.g. Postgres.js also works with CockroachDB). I would love to write a longer post about the process at some point, but I haven't found the time for it yet.
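For a taste of how simple the protocol is, here's a rough sketch of sending a simple query message over an already connected socket (my own illustration, not code from the library):

  // every message: 1-byte type, 4-byte big-endian length
  // (which includes itself but not the type byte), then the payload.
  // 'Q' is the simple query message, with a null-terminated SQL string.
  const query = 'select 1\0'
  const msg = Buffer.alloc(1 + 4 + query.length)
  msg.write('Q', 0)
  msg.writeUInt32BE(4 + query.length, 1)
  msg.write(query, 5)
  socket.write(msg)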


Hey.

Love this lib. I am curious why you decided against using TS here? You mentioned it's not your cup of tea.


I can only assume that I've finally found a soulmate who just doesn't like writing TS code.

Like, in my full-time job I'm using 100% TypeScript, but I don't really ENJOY writing in it - that's why in all my side projects I use good old JavaScript.

I think it's just a personal syntax preference.

But I don't want to answer this for the OP, just adding my 2 cents - maybe he has similar feelings about it :)


Haha.. There's plenty of us out there, it's probably just that we'd rather do actual stuff than talk about doing it. A bit like Typescript - it doesn't really do anything, it just talks about it.

Joking aside, I generally don't want to get into the debate, because I believe people work differently, and there should be room for doing both things. It's just a bit sad that Typescript is being pushed so hard as if it's the only right way.


> It's just a bit sad that Typescript is being pushed so hard as if it's the only right way.

It's being evangelised as if it has zero downsides, yet it introduces overhead to what can already be a brittle dependency/build chain while providing questionable levels of actual type safety. This is coming from someone who generally prefers typed languages too; TypeScript to me is the exception to that rule.


Yep! I'm the same way -- it comes from a love of simplicity and an affair I had with Lua, which I find marvelously simple and inspirational.


Some people really love the simplicity that comes from zero compilation.

I like that too, so I always start with just JS, but then two files in I realize I really want my typings.

Now I use esbuild instead of tsc, and I have the best of both worlds.


> Now I use esbuild instead of tsc, and I have the best of both worlds.

I'm interested in this. I know esbuild can compile TypeScript to JS, but that it doesn't serve as an actual typechecker. Without tsc as a dev dependency, do you just rely on your IDE's intellisense to tell you when there's a type error?


I still install tsc, but I don’t actually do the type checking except at release time (and whatever typechecking the IDE provides through the language server).
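One way that can look in the scripts section of package.json (script names and paths here are my own):

  "scripts": {
    "build": "esbuild src/index.ts --bundle --outfile=dist/index.js",
    "typecheck": "tsc --noEmit"
  }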


Very interesting project! The source is inspiring; it seems the author has made an explicit choice to minimize dependencies.

Even transpilation from ES modules into commonjs is done by the author himself:

https://github.com/porsager/postgres/blob/master/transpile.c...

I'm not sure if I'd have made the same choice, but it's fun to see it can work out great :)


Manipulating source code via regex... I wonder what could possibly go wrong.


With NPM being such a security nightmare every time you rope in another dependency, and myself being a long-time user of Postgres.js (and much of porsager's other work in the Node community), I do think this was the right decision.


The results are run through the same tests ;)


Tbh I think this is quite easy to mess up, using a simple regex to convert…

https://github.com/porsager/postgres/blob/master/transpile.c...


What a coincidence seeing this on HN. Great lib! I actually experimented with replacing our use of node-postgres with this today. Your lib was the reason I finally got around to shipping support for tagged template literals in Imba today (https://github.com/imba/imba/commit/ae8c329d1bb72eec6720108d...) :) Are you open to a PR exposing the option to return rows as arrays? It's pretty crucial for queries joining multiple tables with duplicate column names.


Thanks a lot! What a coincidence - and perfect timing since v3 supports .raw()[1] to receive results as arrays.
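A sketch of how that helps with duplicate column names (tables made up; see [1] for the details):

  // as objects, the second `id` would clobber the first;
  // as arrays, both columns come through positionally
  const rows = await sql`
    select users.id, posts.id
    from users
    join posts on posts.user_id = users.id
  `.raw()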

I'd also be very curious to hear how replacing pg goes :)

And also good job on Imba! I'm a really big fan of stripping syntax down to the bare essentials, and what you've done with Imba is really impressive!

[1] https://github.com/porsager/postgres#raw


Brilliant! We're well on our way to migrating. The experience has been buttery smooth so far. And the codebase itself is really well organized. Huge thumbs up!


Wow - that's awesome! Thank you!


I spent some time today trying to replace pg but I ran into an issue. When using an RDS proxy with IAM authentication, it seems to repeatedly retry authentication, and eventually my Lambda functions time out. If I switch to using regular credentials it works fine. Using the exact same options with pg + IAM authentication also works fine, which leads me to believe it's an issue with this project.

I'll open an issue on GitHub tomorrow.


That's interesting - haven't heard of that issue before, so do please create an issue on github or let me know if you figure it out ;)


It'd be very interesting to know if it is the RDS proxy that is causing the problem. It's one of the things that annoys me about RDS and AWS services.


It was indeed the RDS Proxy. It appears to be very strict about the client_encoding parameter. Postgres.js was sending 'utf-8', which PostgreSQL will understand, but RDS Proxy would just hang until this was changed to UTF8 (uppercase with no dash).

https://github.com/porsager/postgres/issues/288
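For anyone hitting this before a fix lands, I'd guess the startup parameter can be overridden through the connection options, something like:

  // assumes the connection option passes this straight through to the startup packet
  const sql = postgres({ connection: { client_encoding: 'UTF8' } })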


Here is the bundlephobia of this: https://bundlephobia.com/package/postgres@3.0.0

  33.6 kB Minified
  12.5 kB Minified + Gzipped


Hey - that's pretty nice :) Didn't think of trying that - curious if it includes both the cjs and esm versions in the sum, since that's two copies of almost identical code.


Oh, you think it should be just ~6kB min+gz? I think the site looks for `package.json` and finds the ES6-style entrypoint. It should only build from a single entry.

---

EDIT: according to https://github.com/pastelsky/bundlephobia/discussions/546 they say it looks for `main` entry.


  const users = await sql`
    select name, age from users
    where age > ${ age }
  `

Do template literals like this in JS work differently than just straight up string interpolation? As a rails dev this set off alarm bells.


Yeah, when a function is called as a "tagged template literal"[1], the function takes control over how the parameters are handled. Postgres.js uses this to replace the values with $1, $2, etc. and send the values as parameters to the database, thereby preventing any chance of SQL injection[2].
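A rough illustration of what a tag function receives (not the actual Postgres.js internals):

  function sql(strings, ...values) {
    // for sql`select name from users where age > ${ age }`:
    // strings -> ['select name from users where age > ', '']
    // values  -> [age]
    // so the library can emit 'select name from users where age > $1'
    // and send [age] separately as parameters
  }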

[1] https://developer.mozilla.org/en-US/docs/Web/JavaScript/Refe...

[2] https://github.com/porsager/postgres#await-sql---result


nice!


great to use template literals, but is there any way to use prepared statements in order to prevent SQL injection and that type of stuff?


Just saw this on the page:

> Parameters are automatically extracted and handled by the database so that SQL injection isn't possible. No special handling is necessary, simply use tagged template literals as usual. Dynamic queries and query building can be seen in the next section. // todo

this is great! Can't wait to give it a try.


How can this be so much faster than even pg-native that uses the C library libpq?

https://github.com/porsager/postgres-benchmarks#results


It leverages pipelining and implicitly creates prepared statements.

And I've spent quite some time optimizing the js code and parsing.
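The usage pattern that benefits looks something like this (a sketch of the effect, not the internals):

  // issued concurrently, the queries can be written to the socket
  // back to back instead of waiting for each full round trip
  const [users, posts] = await Promise.all([
    sql`select * from users`,
    sql`select * from posts`
  ])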


Do MySQL!


Why not give it a try yourself? Building a database library is an educational project, and you can give back if it turns out to be a quality one, too.


> How can this be so much faster than even pg-native that uses the C library libpq?

the c++ to javascript membrane in v8 is expensive to pass. there are many projects that are faster implementing protocols in javascript vs c/c++ because of this.

redis is another.


This can’t be overstated. Crossing the JS<->native barrier is inherently a perf hit. I’ve measured it trying a large variety of high performance messaging protocols and postMessage with structured clone nearly always wins. If that sounds like a bold claim, there are troves of Node and Deno issues where they’ve improved perf by staying in JS specifically because calling into native and back has been the bottleneck. There are countless similar issues in V8 and WebKit as well.

You can get a perf boost with WASM or NAPI, but only for workloads which are CPU bound and where you expect to do that work in compiled code.


I responded to the OP, but I'll add here too for the conversation. Where are these projects that are implementing network protocols, where the nodejs versions are faster than the native ones?

As I noted in my other comment, in -any- of those implementations, you're still going to be required to traverse from libc somewhere to nodejs (even if it's just to read the network data out of the socket and send it to v8). The performance gains would have to either come from a faster (JIT'd) protocol parser, or from eliminating additional FFI calls (that it isn't obvious why they'd exist).

So, would you mind linking to one or some of those projects? I'd love to see what happened to the implementations.


> The performance gains would have to either come from a faster (JIT'd) protocol parser, or from eliminating additional FFI calls (that it isn't obvious why they'd exist).

or, as I noted above, and will try to simplify here:

let's suppose you have a buffer which contains 5 key/value pairs, and you want to convert that into an array of 5 javascript objects in v8.

1. obtain a pointer to the isolate
2. create an array object, and check
3. create 5 objects, and check each one
4. create 5 key objects, and check each one
5. create 5 value objects, and check each one
6. convert each key into a v8 typed object, and check each one
7. convert each value into a v8 typed object, and check each one
8. attach each key to each object, and check each one
9. attach each value to each object, and check each one
10. attach each object to the array, and check each one
11. check that array one more time
12. done

vs.

1. obtain a pointer to the isolate
2. create a buffer, and check
3. parse in javascript

inside of v8, creating a javascript object inside of javascript is much faster than doing the same via c++, because of the additional checks that are needed each time you cross that barrier.

    Local<v8::Value>::New(isolate, String::NewFromUtf8(isolate, t).ToLocalChecked());
is an example of simply creating a string, not assigning it into an object.


Hey Jerry!

I'll respond here so we don't have split threads :). I'm also quite familiar with at least the older versions of the node internals (I too maintained a popular db binding for a number of years), and I'm very confused by the way you're positioning the operation of the v8 vm in these 2 scenarios. Sure, the c++ is going to require you to do some sanitizing as you force your data into v8, but as we noted, that's inevitable no matter how you slice it. After that though, even once in javascript land, you're still crossing these barriers constantly to allocate data and objects and memory, etc. You don't just end at 'parse in javascript'; the virtual machine is going nuts calling into this same c++ codebase. Now maybe in some cases the v8 internals offer some advantages the generic c++ api can't access, but so far this argument hasn't convinced me that they exist.

My memories of the redis client are different than yours, so I'd be quite interested to see those conversations/benchmarks. From what I recall, those early advantages in the js redis client were similar to the ones we're seeing here, i.e. better pipelining of commands.

As a simple thought experiment: in the scenario you're describing, we should see a javascript implementation of a JSON parser beat the pants off the v8 engine implementation, but this doesn't seem to be the case.


> Sure, the c++ is going to require you to do some sanitizing as you force your data into v8

it's not just sanitizing, there's a lot more to the object creation inside v8 itself. but, even if it were just sanitizing, that mechanism has become a lot more complicated than it ever was in v8 3.1 (timeframe around node 0.4) or 3.6 (timeframe around node 0.6). when interacting with c++, v8 makes no assumptions, whereas when interacting with javascript, a large number of assumptions can be made (e.g. which context and isolate is it being executed in, etc).

> but as we noted that's inevitable no matter how you slice it.

yes, from c++ to javascript and back, but when you need to make that trip multiple times, instead of once, that interchange adds up to quite a bit of extra code executed, values transformed, values checked, etc. sure, banging your head against a wall might not hurt once, but do it 40 times in a row and you're bound to be bloodied.

> Now maybe in some cases the v8 internals offer some advantages the generic c++ api can't access

by a fairly large margin, as it turns out, especially as v8 has evolved from the early 3.1 days to the current 9.8: 11 years. there has been significant speedup to javascript dealing with javascript objects compared to c++ dealing with javascript objects. see below.

> My memories of the redis client is different than yours so I'd be quite interested to see those conversations / benchmarks.

super easy to find, all of that was done in public: https://github.com/redis/node-redis/pull/242 - there are multiple benchmarks done by multiple people, and the initial findings were 15-20% speedup, but were improved upon. the speedup was from the decoding of the binary packet, which was passed as a single buffer, as opposed to parsing it externally and passing in each object through the membrane.

> As a simple thought experiment, in the scenario you're describing we should see a javascript implementation of a JSON parser to beat the pants off the v8 engine implementation, but this doesn't seem to the case.

that's a bit of a straw man argument. especially given that JSON.parse() is a single call and does not require any additional tooling/isolates/contexts to execute, it's just straight c++ code with very fast access into the v8 core:

    Local<v8::Value> result = v8::JSON::Parse(context, jsonString).ToLocalChecked();
but, let's take your straw man a little further. let's suppose that all of the actual parsing is done for you already, and all you're doing is iterating through the data structure, creating objects through the c++ api, and calling it good. that should be faster than calling the c++ JSON.parse(), shouldn't it? since we don't have to actually parse anything, right? no, it's actually much slower. you can see this in action at https://github.com/plv8/plv8/blob/r3.1/plv8_type.cc#L173-L60...

again, we're not talking about whether javascript in an interpreter is faster than c++, we're talking about whether v8's api causes enough slowdown that some workloads that require a lot of data between c++ and javascript are slower than the same workload that requires very little data between c++ and javascript ... because passing through v8's c++/javascript membrane is slow.


I can't link to any projects; implementing network protocols per se isn't what I focused on when researching this. My focus was on optimizing postMessage between threads. They're not that different, but the important distinction is that the user-facing API is in JS, not a network boundary.

My hypothesis was that:

1. Converting to binary data

2. Using facilities for shared memory

Would yield better performance in a language well suited to de/serialize binary data, with a message encoding designed for performance. The latter does perform better! If you need to spend a lot of CPU time wrangling binary data, it’s a clear win. If you have a workload with many small messages, they invariably slow down compared to structuredClone. If your workload can avoid crossing the boundary that’s a clear win still. But the moment JS VM values need to get shuttled around, structuredClone is specifically optimized for structuredClone-able values, and optimizing the JS/native boundary for data which benefits from that is an extremely narrow edge case. The only way I’ve found to win is not to play.

That said I don’t have the cleverness a lot of performance geeks have and maybe there’s some technique I’ve missed! But I honestly can’t imagine how I’d optimize JS/native interop better than the JIT without finding myself getting surprising new career opportunities on a VM team.


I think you're right. Here's a project proving the exact opposite of native bindings being slow - https://github.com/uNetworking/uWebSockets.js


I think you might find that a large part of their speedup comes from not using any of the openssl trappings of node, and using boringssl instead.


On the surface I'm not sure this explanation passes the smell test. Almost irrespective of how you get the data from your network card into v8 / nodejs, you're going to be crossing c++/v8 boundaries. It's possible that a native binding goes through this expense multiple times, but that's not likely how most of these bindings work (pg at least, for sure).

As the author noted elsewhere in thread, the performance gains are mostly coming from prepared statements and pipelining which seems a bit of an apples to oranges benchmark as most of these native drivers will support similar features, just not enable them transparently like this library does.

Out of curiosity, do you have links to these other projects where they have similar benchmarking attempts/results?


> On the surface I'm not sure this explanation passes the smell test. Almost irrespective of how you get the data from your network card into v8 / nodejs, you're going to be crossing c++/v8 boundaries.

yes, you are, but the differences are the object creation that occurs. a single buffer coming from c/c++ (a socket, let's say) can be parsed and turned into a large number of objects in javascript much more quickly. yes, you're passing through that barrier once, but creating all of those objects from c++ and passing through it 20-30 times is a lot more expensive.

> Out of curiosity, do you have links to these other projects where they have similar benchmarking attempts/results?

how about pg vs pg-native? https://github.com/porsager/postgres-benchmarks#results

and unfortunately, I cannot find the original discussions from when node-redis went from native to pure javascript, but it was about a 30-40% speed increase originally if memory serves (I was the one who did that original conversion after a lot of deep dives into v8 and performance crossing the barrier).

as an aside, I'm also the maintainer of plv8, and am happy to discuss the same types of performance issues of dealing with jsonb vs json (which in Postgres is text): creating objects vs a simple JSON.parse() in c++ is a significant difference.


I'm curious how this project compares to slonik, which is also fast, lightweight, full featured, uses SQL template literals, etc. Is there any reason to consider switching, for example?


There is quite a big difference, but I'll highlight some of the main points here. Note I'm the author of Postgres.js so I'm obviously biased.

Slonik is a wrapper around another node driver (pg), so its performance is the same as that, whereas Postgres.js is significantly faster (2-5x)[1].

Postgres.js is also a zero dependency module, whereas Slonik has quite the dependency graph - compare https://npmgraph.js.org/?q=slonik with https://npmgraph.js.org/?q=postgres. That makes it more difficult to audit the code and to prevent supply chain attacks, etc.

Slonik also doesn't have the same lean, straightforward developer experience as Postgres.js - again, I'm biased saying that ;)

Postgres.js also does things that will make your queries perform better out of the box by implicitly creating prepared statements and using pipelining.

I suppose you could build Slonik on top of Postgres.js instead of pg as well, but that would probably only make sense if migrating to Postgres.js.

[1] http://github.com/porsager/postgres-benchmarks#results


> Postgres.js is also a zero dependency module, whereas Slonik has quite the dependency graph meaning - compare https://npmgraph.js.org/?q=slonik with https://npmgraph.js.org/?q=postgres.

This one just made my day. Thanks. I remember trying to build a tool at work with as few dependencies as possible (in Python), and how satisfying it was to see quite a few dependencies that were just wrappers replaced with 5 lines of my own code that I could easily audit and ensure no supply chain attack was possible for that functionality.


Excellent work as usual, Rasmus! :) Loved my time with postgres.js in past projects, looking forward to checking out v3 in an upcoming project of mine. The speed and lack of any dependencies, while being so full-featured is really attractive.


Hey keb_ :) Thanks a lot!


Is raw using the binary protocol? (for example: send/receive a Uint8Array Buffer without converting to a hex string for a bytea column)


No, that unfortunately didn't make it into this release. I want to add that in a future update, first at least for bytea, but then also for other types that benefit from it.


This is an example of a great Node.js library. Congrats on v3!


Looks good from what I see. How does this compare with slonik?

Also what do the error messages look like?


Slonik is a wrapper around another Node.js postgres driver (pg). I guess slonik could be built on top of Postgres.js, but I'm not sure what the benefit would be, since you get the same and more from Postgres.js. If you have an existing project using slonik, it might benefit from the better performance of replacing pg with Postgres.js underneath (https://github.com/porsager/postgres-benchmarks#results)


Man, I've been pleasantly surprised by the v3 updates in my inbox over the past few days (subscribed as watching on the GitHub repo).

Thanks for your work @porsager!


This looks really nice.

The template strings and querying work somewhat similarly to my own library, Zapatos[1] (but that sits on top of the pg package and is TypeScript-specific, being more focused on having everything typed).

[1] https://jawj.github.io/zapatos/


Wow, this is very cool and comes with perfect timing for me since I am starting a new project. Thanks a bunch!


This looks great! I really like the idea of using tagged template literals for queries.


Very cool!

Would you consider this production ready? I rarely if ever use relational databases for my personal projects, but this could be a great fit at work.

Also, is there any way to add an ORM layer? I use Firebase right now, so I'm used to that vs writing queries.


Thanks. Definitely production ready - I've been using it in production myself for quite a while leading up to this release.

About the ORM layer I wouldn't know, since I much prefer to avoid that at any cost.


Ok, I'd have to double check to see if the Unlicense is ok for work projects, but this is a great contribution to the node ecosystem.


I actually replaced WTFPL with Unlicense since someone at google couldn't use it, so I would think it's ok?


Any reason you can't use MIT?

For hobbyist projects the Unlicense is great, but it may create issues with bigger companies. I think the main sticking point is that public domain doesn't really exist in countries like Germany, so the Unlicense's standing there is unclear. Companies don't like unclear things.

Facebook uses MIT, and their tools like React and Jest are cornerstones of the NodeJS ecosystem.

https://github.com/facebook/react/blob/main/LICENSE


The logo is amazing!


Thanks porsager. Any plans for Knex support?


> Any plans for Knex support?

That question doesn't really make sense. I'm not familiar with this library, but the developer interface is very similar to slonik. The whole point of this is that SQL tagged template literals replace the need for a query builder like Knex.

Here is a good blog post from the author of slonik explaining: https://gajus.medium.com/stop-using-knex-js-and-earn-30-bf41...


I'm broadly a fan of just writing SQL, but Knex does have one important advantage: it automatically declares the return types from a query based on my DB structure. If I write SQL by hand, I have to annotate the query with the return types, and those annotations can get out of sync with the query.


Postgres.js does the same, transparently. No advantages of Knex here.


Slonik has a typegen package that works great, https://www.npmjs.com/package/@slonik/typegen


Yeah, I think it could make sense for performance reasons to make a version for Knex, but since I try to stay away from ORMs as much as possible, I'm probably not the right one to make it ;)

I also wouldn't put it in the core library anyway, so it should be straightforward for someone else to make a knex-postgres package that does this?


Knex is a query builder, not an ORM :) Would love to replace pg with your library for the performance gains!


wow i really love this. will try it out tmrw in a new greenfield project.



