Hacker News new | past | comments | ask | show | jobs | submit login
What’s so exciting about Postgres? (changelog.com)
118 points by jerodsanto 87 days ago | hide | past | favorite | 130 comments

This was a really good article, and thank you @jerodsanto for submitting it. I learned probably 6-7 big things about Postgres reading it that I didn't know before.

I won't list out those things, because, I'm sure it wouldn't be relevant to most people reading this comment. :). But, there are a couple things I noticed that I do want to mention:

> There were these edge cases in there, where it’s kind of like less safe. And Postgres was there, and it had a reputation of being just rock-solid, consistent, and more strict with your data

> "... I never had that moment where I’m like “Oh, Postgres, you screwed me over.”

This is a huge deal, and, IMO, the reason to choose Postgres over MySQL. MySQL has some horrendous defaults. You can make things a little better, but, AFAIK, you can't even get to where Postgres starts out of the box. Most of that consists of MySQL being extremely permissive with what it allows you to do in your SQL. Many of those things are non-standard, and some are quite unsafe. Postgres not only prioritizes safety, it keeps the application developer honest.

It can be a little complain-y, but I'd rather have my database complain than do stuff that could lose data. One great example is when you try to stuff more characters into a text field than it can accommodate. By default, MySQL just truncates the string silently; Postgres just doesn't allow it.

> PostGIS is a huge one. It’s a whole geospatial database.

This is the other point I want to call out. I've had the pleasure of working with PostGIS before. The main thing I'd want to say about PostGIS is that, although there are issues working with geojson and GIS in general, PostGIS doesn't exacerbate any of them. Generally speaking, PostGIS is about as pleasant to work with as a geospatial database can be.

> Most of that consists of MySQL being extremely permissive with what it allows you to do in your SQL. Many of those things are non-standard, and some are quite unsafe. Postgres not only prioritizes safety, it keeps the application developer honest.

This is incredibly important: if you use MySQL, unless you are extremely pedantic about using safe defaults and a safe-by-default ORM you will almost certainly end up in the position where migrating will be hard because you are going to need to rigorously test & remediate all of the areas where your code was relying on MySQL allowing erroneous code to run without errors and nobody ever looked at the warnings.

I've even seen a few cases where people had data loss which had either gone unnoticed or been sporadic enough that they assumed it was user error or random corruption.

> I've even seen a few cases where people had data loss which had either gone unnoticed or been sporadic enough that they assumed it was user error or random corruption.

I've gotten to troubleshoot this specific condition myself. Amusingly, the big thing that saved us was that we could use lingering redundancy in a poorly normalized schema to rebuild some relationships that should've been enforced by the database.

> I've even seen a few cases where people had data loss which had either gone unnoticed

Maybe under MyISAM, but InnoDB has been the default storage engine since 2010.

Under InnoDB, because we’re not talking about file corruption but cases where MySQL would silently and irreversibly discard data rather than throwing an error. For example, inserting a value which is too large for the target data type - it truncates the value so that value is lost unless you have another copy or way to reconstruct it.

People always criticize MySQL, then when asked what is the problem, they always mention the same thing about column length, which is just a default that can easily be changed and it's not even the default anymore for years.

That’s the easiest thing to mention but it’s far from the only one I’ve encountered from using MySQL since the late 90s. The reason things like this keep getting mentioned is that they keep breaking databases: it’s like trying to excuse all of the salmonella cases at a restaurant which serves undercooked chicken because you can remember to order it fully cooked.

In 1998, MySQL was appealing because it was very fast on simple queries and didn’t cost a further. Until the early 2010s it had a better replication story than Postgres. But by now it’s often slower than Postgres on my apps, feature poor, and you don’t notice how many limitations you’ve internalized until you realize it’s been years since you had to repair or kludge around a MySQL quirk.

> For example, inserting a value which is too large for the target data type - it truncates the value

I commented in another thread about this. It's not black or white whether that's a bug or a feature since:

1) not truncating causes the whole row to be rejected in Postgres, or with the relevant MySQL server setting

2) for many SaaS and social media apps, truncating that column type is fine, since typically that's a description field

3) auto-truncation means you don't have to sync your app and database schema after every change.

I prefer and rely on MySQL's behavior actually, even though I work with both MySQL and Postgres.

Source: DBA.

I'm trying to think of a delicate way to put this but I'll be blunt: when you put “Source: DBA” my reaction is not “oh, an expert who is correct when everyone else is wrong” but “wow, a DBA who doesn't care about data integrity? It's like finding a doctor who doesn't wash their hands.”.

As someone who's trained a bunch of DBAs, here's how you should explain that this is a bug:

A database's job is to reliably store and retrieve data: that's why we talk about ACID since the whole concept is built around moving from one correct state to another. If we didn't care about getting back what we stored, we could just write flat files on disk and shrug when it breaks. Silently discarding some of the input breaks that contract.

Similarly, a core part of the SQL standard is about the database ensuring that values match the defined schema constraints. If we're going to allow silent truncation, the same reasoning would allow inserting foreign keys which don't exist, coercing non-numeric values into NULLs or other ways of “handling” non-numeric data types, replacing invalid dates with "0000" (oh, wait…), ignoring CHECKs (which MySQL did until 8.0.16), etc. If we're not doing that, we could just use MongoDB and stop pretending to be more than a blob store.

Rejecting the entire row is exactly what a DBA should want because it means that a) the database only contains the valid data it promised to save and b) the application team is immediately alerted to a mismatch between what their application assumes and how the database schema is actually configured — since there's been a breakdown in the migration process, there are likely other problems as well which might be more subtle. Especially in this century there's no excuse for not using a migration framework which makes the process of keeping the database schema in sync with the application automatic and reliable.

In one of the rare cases where data truncation is okay — and note that I've never encountered one of those in 3 decades where the users saw it as anything other than “This garbage software doesn't let us store more and that's why our department really runs on the Excel spreadsheet one of the admin assistants created” — the correct approach is to implement it at the application level so the user can see a fixed-length input field, warnings if they try to enter more, etc. and the application doesn't produce odd errors when, say, it stores a value and attempts to retrieve the same value only to be told it doesn't exist.

I think you are getting lost in the details. Databases are inherently complex and there are sharp edges as well as tricks you need to learn for each one. Beyond that row stores like MySQL and PostgreSQL are essentially interchangeable. The biggest technical success factor is whether you understand the DBMS well enough to use it effectively. The biggest business success factor is licensing: MySQL is GPLv2 and PostgreSQL is BSD.

Specific features are somewhat interesting but at this point not many teams are starting from scratch on MySQL or PostgreSQL. It's generally the case you already have experience on one or the other or both.

> I'd rather have my database complain than do stuff that could lose data. One great example is when you try to stuff more characters into a text field than it can accommodate. By default, MySQL just truncates the string silently; Postgres just doesn't allow it.

That's your opinion, but the truth is not necessarily black and white.

Postgres will probably lose more data in your case when the application developer doesn't handle the exception for a text field too long, and loses the whole row. (If you're an application developer, you should audit your code now.)

A well-written application using Postgres would have to truncate the data before insert, achieving the same result as having MySQL auto-truncation, but with more initial and on-going maintenance effort in the case of Postgres.

And the existence proof that MySQL's behavior is ok is that MySQL is the most widely-used relational databases, and powers most Internet sites.

Source: DBA.

Here's the deal: I am not presenting my opinion as objective fact, nor do I actually care that you're a DBA. If anything, the fact that you are a DBA makes your opinion less relevant to me as an application developer. Simply put, you do not share my pain, and the people who have upvoted me, while your comment is currently gray as I write this, have.

The example was intended to show a scenario in which MySQL silently loses data. There are certainly situations in which this sort of data loss is acceptable. I've been involved in scenarios where it was not acceptable, I was surprised by the behavior, and I ended up with a bunch of missing data.

The problem here is that I was surprised by my database. That should literally never happen. The correct behavior in this specific case is for the database driver to raise an exception or warning (exactly which should probably be a configurable option, defaulting to an exception). That way, if an exception gets raised in my application, at the very minimum, I have a traceback in my logs to tell me something went wrong, which allows me to fix the issue sooner rather than later.

As for your existence proof that "MySQL's behavior is ok," the fact that MySQL has a bigger installed base literally proves nothing. I guarantee you there are folks out there who are experiencing such behavior and being surprised by it, just like I was. There are people out there for whom this specific behavior is not a problem. And, there are people out there who are length checking every bit of text they try to stuff into a VARCHAR(n) or CHAR(n) field using MySQL, because they started with MySQL and they're basically stuck with it, unless they want to do major surgery on their app.


Guess you're out of actual arguments now that you've resorted to name calling.

Edit: I notice a lot of gray on your profile page under comments. Do you need help? This is a genuine offer. My email is in my profile.

> Postgres will probably lose more data in your case when the application developer doesn't handle the exception for a text field too long, and loses the whole row. (If you're an application developer, you should audit your code now.)

The thing is, you'll noisily lose your data, which is far more likely to be caught and corrected. Even the most roughly put together web app is probably going to have some mechanism to alert about server-side production errors.

> A well-written application using Postgres would have to truncate the data before insert, achieving the same result as having MySQL auto-truncation, but with more initial and on-going maintenance effort in the case of Postgres.

Not necessarily - you could alert the user that they need to provide less data, or mitigate it in some other way (maybe storing a summary and the full version in a blob somewhere).

Your example is outdated, MySQL doesn't truncate by default in later versions. Defaults can easily be changed. Choosing one database over another because of defaults is not a good strategy.

The point still stands, however — here's an incomplete list of everything you have to worry about with a MySQL database which you don't need to worry about with most other databases:


(I believe there are also still issues with Unicode, too)

I ran into a fun one a few weeks back: code which had been running for years failed on an AWS Aurora cluster because they'd defined a field as BIGINT and were inserting UUID_SHORT() values into it. For some reason, Aurora always generates values with the first bit set and so they learned the hard way that that field needed to be declared as unsigned.

Yes, the defaults can be changed but many, many people do not know this and will only learn why they need to after they have something blow up painfully. Often, even people who know this in theory will forget about it at some point when setting up a new one. It's the same reason why the industry is trying to get away from services like MongoDB / ELK listening on with no password or having a default password which can easily be scanned. For something as widely installed as MySQL, even a 10% chance of oversights will mean a LOT of collateral damage.

Fine, my example is outdated. That's irrelevant. I could probably find 3 more examples in the current default config for MySQL that would be as bad or worse.

As for the rest of your comment, sensible defaults most certainly are a valid reason to choose one piece of software over another. Bad default configs are equivalent to what the Dungeons & Dragons community calls "newbie traps." They're things that either don't look important, or don't look like they'll hurt you, but really are important, or can hurt you, if you're not extremely careful. What's more, some of these options are probably not going to be backward compatible with your code. So, you get into this situation where MySQL is doing shitty things you don't want it to do, and, when you find out later that it's doing those things, you either can't do much about it, or doing something about it involves major surgery to your application. That's bad.

So, yes, you can configure MySQL to have better than default behavior. That's no surprise; that's literally the point of having configuration options, after all. The problem is that some of these options don't look too bad. There is something to be said for having a permissive configuration. But, for the reasons I spelled out in the previous paragraph, that shouldn't be the default configuration. It's not too much to ask that you should be able to install and configure a database and not have it lose or corrupt your data.

PostgreSQL isn't exciting. It's boring, with roots in the Berkeley Ingress project from the mid 80s. Which is why 10 years ago, so many HNers went with trendy NoSQL data stores instead, even though they lost data. Data loss is very exciting, and PostgreSQL is very boring. This preference for what's exciting and trendy over what's old and proven is a large reason why so much software today sucks.

That seems unfair.

NoSQL was trying out new ideas in data storage. It was exciting to try out new or re-imagined core concepts, and some of those young projects had teething issues. But several are still around and remain popular, but they're popular for certain niches they excel at (and those niches were largely discovered through trial and error).

In the SQL-sphere a lot of people skipped Postgres because MySQL had, at the time, the momentum in the free/cheap relational database space. Between then and now Progres has grown more elegantly than MySQL, and people are rightfully looking to it.

> NoSQL was trying out new ideas in data storage

Was it? Or was it just rehashing old ideas made obsolete by relational databases?

> https://en.wikipedia.org/wiki/IBM_Information_Management_Sys...

> Initial release 1966; 54 years ago

> https://en.wikipedia.org/wiki/Category:NoSQL

MySQL was also a lot faster in the early days.

Plus it “just worked” not sanely mind you. But you had a lot less problems with type mismatches, up/downcast, etc. Well unless you wanted quality data. But back then we were just happy that it didn’t bother us with “minor” details.

It was a lot faster, and sometimes the odds that your data might live long enough for those integrity issues to matter seemed very remote.

Around 2001, the conventional wisdom was that it was worth the risk to take the performance victory, keep your hosting bill down, use mysql, and build some other approach to data integrity. (If memory serves, the integrity downsides were bigger then, too...)

The math is pretty different now. I'm not sure it's fair to hang all of it on Oracle, but it feels like that acquisition derailed some MySQL progress that might've made it a close call today. Now, I use Postgres unless I have a very specific, very compelling reason not to.

Apart from being the most mature relational database, I would argue the code that runs it. It's pure poetry. I am yet to see code this clean anywhere. Just opening any source code file at random[1] will show you exactly what I mean.

[1] https://github.com/postgres/postgres/blob/master/src/backend...

Let's not get carried away. It is very much debatable which database is "most mature".

Here is some discussions about how Postgres compares with Oracle DB (if $ is not an issue), it is often said that Postgres is a Cessna whereas Oracle DB is like a Fighter Jet: https://news.ycombinator.com/item?id=24582937

I have been working with postgres for multiple years and in one project I needed to work with oracle. I actually ran into a bug in oracle db and they did not listen to us because we where a small company. We where working for a very large customer and asked that company to complain on our behalf at oracle. They they finally fixed it. Project was a large GIS project. I never had a say in the choice of database otherwise I would have chosen postgres. Oracle regularly crashed on complex spacial queries. At some point I left te company just to stop trying to fix that product. I think the only way of fixing the problem was a rewrite of the product that would use postgres. I hope I never work with oracle products in my life again.

I've never had the opportunity to test Oracle DB at large, just some minor playing around with 12c many many years ago so I don't have an accurate view. That said, given my experience with other Oracle products and the fighter jet analogy... De omnibus dibutandum est...

Fully agree with this. I once heard it stated by someone "I don't like writing C, but I don't mind at all writing Postgres C". It is definitely a quality codebase and the credit goes to a pretty small set of maintainers over the years.

I started using PostgreSQL because, at the time, MySQL wouldn't do subqueries. The fact it silently truncated VARCHAR data to the size of the column and didn't give out an error didn't help much.

Overall, my impression is that the project's philosophy is to do the Correct Thing, even if it's slow.

Few years back I've had friend explain the difference between MySQL and PostgreSQL like this:

- MySQL: do it fast, then make it right - PSQL: do it right, then make it fast

I had switched after learning that CHECK constraints were parsed, but ignored -- I've yet to be as offended by a program that at that instant

Apparently supported now but what the hell; I don't think they even threw a warning about it

Years back, the conventional wisdom seemed to be that MySQL used to be faster, and PostgreSQL more correct. Over time, according to that same conventional wisdom, MySQL got more correct than it used to be and PostgreSQL got faster than it used to be.

Databases shouldn't be exciting. They should just work.

The main reason webcrap works is because the databases underneath work. If the Javascript crowd had to manage data storage, nothing persistent would work reliably.

> Databases shouldn't be exciting. They should just work.

With enough experience with the alternative, things that just work are exciting.

And it's wonderful to work with someone whose stuff just works


Hear, hear.


PG steamrolls every other general purpose database, and has tons of extensions, and frontends so you can use it in any database paradigm.

With 30m minutes of time, you can turn it into a document db, log/time series db, columnar, graph, key value, whatever

I bet we'd both seriously consider (if not prefer) sqlite over postgres in a bunch of scenarios?

I love PG and sometimes I see some cases where maybe Sqlite would work "better" but just as soon as it starts to get a little bigger I start missing all the cool stuff that PG has.

Sqlite is still awesome but PG is more awesomer. And since the cost is the same I just stick with PG now. Unless I'm resource constrained (which for my work is rare)

SQLite is amazing, and “lite” is really not fair. I’ve seen it handle 1tb+ with at least decent performance.

It’s not a replacement for a real DB but it is damn close.

Actually as I understand it (and a couple of minutes Googling doesn't back up my case) the "lite" on the end is pronounced "ite" and refers more to SQL as a geology like term. In other words, the SQL is a rock.

Even if that was what the name was intended to be, the vast majority of people pronounce it as SQL - lite. (In fact, this is the first I've heard of someone pronouncing it that way).

The documentation seems to suggest that it is in fact pronounced "SQL-lite": https://www.sqlite.org/fullsql.html

> Don't be misled by the "Lite" in the name.

It's not, because it is a real DB. It's just one for single user access only.

> Think of SQLite not as a replacement for Oracle but as a replacement for fopen().


Well sure, it’s better be thought of as a replacement for fopen() than one for Oracle. (Unless you were thinking of buying Oracle in order to use it on your desktop.) An even better way would be to compare it with MS Access, for example.

sqlite is a wonderful, amazing, file manipulation API. If single-threaded file manipulation is what you need, its great. But is not an application backend in the way PG can be.

  > If the Javascript crowd had to manage data storage,
  > nothing persistent would work reliably.
There are different incentives to writing UIs than there are when you're writing a backend. Often a business needs you to iterate very quickly, so people change their practices to suit this. Some are able to get away with never raising the quality bar once they finish iterating and it's unfortunate how the industry allows this. Additionally, there are, of course, teams and companies that want high quality code, but even here quality is defined differently from how it would be for data storage logic. For instance, it often means "high framerate" and "easy to understand", instead of "optimal" and "robust".

There are less JavaScript engineers writing backend code, and most that do this are generally making simplistic CRUD Node.js apps. However, there are companies and people that specialise in high-quality complex Node.js apps, and they try just as hard as backend engineers from other programming languages to do this well. (I'm thinking of projects made by developers on the core Node team. I've seen good backend code there, perhaps because they're more aware of lower-level concerns.)

I personally write a lot of JavaScript/TypeScript code and I try to do a good job, but a "good job" is defined differently depending on the kind of product you're creating.

Edit: I just realised you are _the John Nagle_, so you likely have a more refined taste in certain kinds of programming than I do, however I think my point about incentives probably still stands.

> Databases shouldn't be exciting. They should just work.

A complicated thing which just works is exciting.

One man's exciting is another's snorefest. A lot of people would probably laugh at the suggestion that anything involving computers could be exciting.

It is unproductive to talk about subjective opinions of excitement.

The GP was referring to "Databases shouldn't be exciting" - not in the traditional sense of the word, but a metaphor that hinges on "excitement" being new features, frivolous decorations, unnecessary tangents, losing vision of what a database should do, frantic patching and heated bikeshedding discussions.

It wasn't meant as "Ooooo I am excited about a database".

I think you're restating my point.

A lot of people imagine databases that way, but they are complex beasts with a lot of features. Everyone seems to say features don't matter, except of course for the ones that they use. Which is often a growing set as your application matures.

All of these features are exciting to the people that can use them.

It is often downvoted to talk about the "javascript crowd" on HN, but I've worked in software industry long enough to unequivocally and with certainty say that it totally lives up to the cliches.

It never gets old to scold people that are not serious, disciplined and principled in their endeavors that affect the rest of the world. Javascript community absolutely deserves the avalanche of criticisms they face.

It is the reason why the modern web sucks, and frankly - I wish folks at Postgres would develop a new web standards including browsers that are not governed by corporations and can be implemented in a couple of weeks. Not trivializing what goes into building browsers, but making the whole standard small enough so any one can build a browser during a sabbatical.

I'm seeing this in action now, the difference between the backend code and the frontend code is stark and the backend code isn't up to my standard either (and my standards aren't that high, I know I'm merely competent for many places).

It just seems to be the nature of the web side of things, lots of sloppiness.

The devs are smart enough, they just don't think and it shows.

That or they use something but clearly don't understand the thing they are using.

I replaced nearly 500 lines of code with ~60 by taking a bunch of massive conditonals re-ordering them then extracting them into methods and returning on the simple case.

This isn't voodoo, you have to work harder and remember more to do it the wrong way.

Also comments, they don't seem to understand you document why you did the thing, not what you did, what you did is in the code, I don't trust your comments to tell me what you did, I want to know why you did it.

That and general documentation (cross link stuff, link at the top of the class to the documentation in the wiki, update the documentation when you get a ticket that changes the behaviour, write the high level flow in plain english a 10 year old could understand).

rant over.

I actually like my job, the people are lovely and the work is interesting its just frustrating when an ounce of thought saves a pound of cure.

Hello fellow "merely competent" engineer. I recommend we both remove "merely" from our title. Inexperience grows from the bottom, I've gotten "better" by simply being too stubborn to get worse. Keep up the mediocre work!

> Inexperience grows from the bottom, I've gotten "better" by simply being too stubborn to get worse.

That is quite possibly one of the best quotes I have heard on how to approach discipline in engineering.

I don't understand, can you please explain?

> Inexperience grows from the bottom

In what way?

> I've gotten "better" by simply being too stubborn to get worse

How does one get better by wanting to get worse?

This makes no sense to me.

I read it as "refusing to get worse". Which implies that your environmnent is getting worse, making you better by comparison.

I'm not sure I agree, but I can understand the sentiment.

I agree about (and am frustrated by) everything you said here, but:

It's not just "the web side of things". Which makes it even more scary.

> That or they use something but clearly don't understand the thing they are using.

You know, I feel like this has bled into backend development too. And it's not entirely the fault of engineers... or so I like to think, to excuse myself :P

What worries me a lot is that I find myself being one of those people, not being able to fully help my situation.

It's just... too much. It seems like every few months a new thing pops up and everyone suddenly wants to adopt it. And even if you try and resist it because it's not necessary, someone from above (e.g. CTO) will decide that the whole org has to adopt that and now you're screwed.

In a typical backend application you have to deal with:

- Application framework

- Testing framework

- Logging framework

- Metrics framework

- Tracing

- Visualization tool(s) for the three things above

- CI

Which is already quite a bit, but obviously manageable and you could get pretty knowledgeable on all of that if they stay static.

But hey, now your org has to expose every service on gRPC. Ok, one more thing to tack onto your app...

Now everyone has to expose GraphQL endpoints too!

Oh and next month we're migrating to a new metrics solution. Because. Great, let's throw away all the knowledge built on the old one and scramble to migrate to the new things we don't have time to learn fully.

Alright, now let's start writing new services in this new language.

And now everyone has to throw events on this new message bus thing that some team over there built using Kafka!

Finally, let's move everything off of AWS managed services and onto Kubernetes! Yay!

All of that can easily happen within the span of 2-3 years. And in the meantime you have to deal with on-call rotations, customer issues, implementing new features... and if you're a senior dev you're also busy mentoring folks, going to meetings with other teams, etc etc.

You know what I'm talking about, right? Over time, I feel like it's becoming impossible to achieve mastery of anything. There are way too many things to learn to be able to make sense of the whole, and no time to do deep dives on all of them. And then there's all the new stuff that keeps being adopted.

And then things blow up and they're freaking mysterious and when you finally find out what it was, you realize you'd only have been able to catch that problem beforehand if you'd read the freaking source code for some library you adopted.

It's madness. It's pure hell.

Boy I miss working on low level code.

End rant.

I have been working with a guy who constantly avoids popular libraries. Constantly. And his product is constantly more accurate (factored, few bugs) and then he can document it without referring to any outside sources. It's pretty good discipline to write things yourself and only grow into libraries when you have proven you know what you need.

That is where the nuance lies, in not reaching immediately for a library when you have a problem, explore the problem until you understand the solution then you know if the library is a good fit or not.

Half my battles with my team are them wanting to pull in dependencies, sometimes it's crazy like a wrapper dependency to pull in a different dependency when it's simpler to just pull in the indirect dependency and use it directly.

It's very much a thing that comes from experience though, when you've been bitten by upstream enough to be a little circumspect about it.

All of this.

I agree with all of it, particularly "Lets throw away X and switch to Y, we lose all the domain knowledge we have with X but Y others this absolutely tiny marginal improvement, so it's worth it!"

As a lead I spend a lot of time pushing back on adding stuff until someone has made the case why we should add it.

You want to use Y, great - document why you want to add Y, what benefit it brings, who runs Y, where is the code, how many maintainers does it have, is it well tested, if want to remove Y later is it a big job, what does it offer that we can't currently do etc.

Should I assume from your comment that you never saw code worth improving other than those written by JavaScript developers?

Ah, pointless exclusion of the middle.


That’s quite the perspective.

It’s a big enough world where you don’t have to interact with every technology that exists, but this is one of the more prevalent ones - and you might find it useful to make peace with that community.

Anyway, good luck out there.

It's a stereotype, and like a lot of stereotypes, it has a little bit of statistical truth, but I don't think it does justice for the excellent front-end engineers that do exist and whom I've worked with personally.

That said, there is some truth in it. The thing about the back end is that there's no satisfaction in it other than the satisfaction of doing it well and earning the esteem of peers, so most back end developers are paying attention to the quality of the code they write. The front end has other attractions as well, and many front-end developers start out focused on those attractions, and learn by experience (if they learn at all) that writing high quality code is related to the cool stuff that got them into programming in the first place.

I didn't mean us to be divisive. Backend engineers need good front-end partners to be successful. Front-end engineers depend on the backend folk to not fuck shit up, which happens time and again.

I meant that the developer community, the folks that are at the helm of developing Javascript frameworks (gahhhh!) and other shenanigans, the entire browser based ecosystem just reeks of incompetence, lack of discipline and rigor.

Frontend and Backend engineers should get together and build a better system that serves both well. That's what I meant by asking Postgres core devs to build front-end tools, something like Gemini [1].

[1] https://gemini.circumlunar.space/

Sorry, but I can’t agree.

Yes, they are capable of making great software, but it’s exactly what gp said... they don’t think. I’ve seen so much code which looks nice from the outside, abd is all structured. But if you start reading, I’m riddled wtfs and questions about why things weren’t done in a more simple concise and stable way.

That's my experience too.

Recently had to deal with some Slack bot written in js at work, and my coworkers had to deal with me ranting about that crap for a whole afternoon.

The way I put it was, this code was written in the most complicated way possible, for no apparent reason.

I agree. We’ve also spent the last 10 years trying to make web app behave like desktop apps and they are still not matching.

Typical bitter backend developer spew. Works with Java or C# and hasn't explored much more than OO and procedural. Threatened by anything new, creative or explorative. Stuck in his ways and prefers to do mundane, uncreative and repetetive tasks, like writing boilerplate to get stuff in and out of a database. Soon to be automated away by serverless.

Typical frontend developer spew. Works with JavaScript or Emscripten, hasn't dealt with data integrity, scale, or security challenges. Threatened by people who build things from scratch. Stuck writing handlers, never to see the whole architecture. Soon to be replaced by GPT-3.

Thank you, this has been entertaining.

This is one of the funniest rebuttals on HN.

I was thinking about the exchange, and it dawned on me that GPT-3 has really demonstrated frontend generation capability. I'd absolutely love to see it build my app's data layer from description (don't call it an ORM!) but I don't even know where to start looking for that.

Ooh, permanent lock in to a single company’s bill by the second mainframe! How exciting!

I see this criticism a lot for serverless but I don’t see it in reality. Serverless is still just running code, the only part that wouldn’t be portable is the entrypoint data structure which is amounts to practically nothing.

The devil is in the details. How many little things would need to be changed, refactored, or rewritten to move to another host?

Good luck running serverless code in a lightbulb.

Last time I looked at Azure IoT Hub for anything I was baffled by the way their system used MQTT but didn't provide general-purpose MQTT access. Frankly they're not a great example of a company understanding how these devices are supposed to work.

Please explain further what negative characteristics do you apply to the ”js crowd“ as a whole.

As a developer who only develops in JavaScript, I feel personally attacked by your post, so I would appreciate if your attack could be more specific to the point of being constructive or at least informative.

> If the Javascript crowd had to manage data storage, nothing persistent would work reliably.

Wait until this guy learns what the JS in "json" column types means, and how the largest webcrap sites rely on them for persistence.

Note that 'Animats didn't comment on JS as a technology, he commented on the crowd.

In 2017 Bryan Cantrill gave a wonderful talk titled "Platform as a Reflection of Values: Joyent, Node.js and Beyond" https://vimeo.com/230142234 In it, he talks about how even though they liked JS as a technology, they came to realize that the values of the JS/Node.js community differed from their values at Joyent.

In the talk (at around 19:30), he identifies that in 2014 Joyent's core values wrt node.js were "debuggability", "robustness", and "stability"; while the core values of the broader node.js community were "approachability", "expressiveness", and "velocity". And there's nothing wrong with either of those sets values, but they're not the same!

With that lens, I'd characterize the parent comment as contrasting the PostgreSQL team's core value of "robustness" with the Javascript crowd's core value of "velocity". That's a distinction in values, not in technology.

Json has nothing to do with the persistence part.

If you need to persistently store frontend state in the backend, would it not be easier to use the native frontend data format?

It's not entirely about persistence... it's about the ease and reliability of the persistence.

My point is that that JSON isn’t being delivered via a js db but instead translated from data stored in a database written in C. So the “what about json” gotcha isn’t as good as it was made to seem.

My point is that JSON is being supported. So the "dismissal of facts" isn't as thoughtful as it was made to seem.

No it isn’t though? JSON is used, by developers, as an intermediary to exchange data pulled from a db. Your point that json is proof that the GP comment is false is not even a point but some kind of fallacy akin to a strawman.

Your inference of my intentions, as an intermediary to deflect from the facts presented, is proof that you're trying really hard to be edgy. Akin to a strawquant.

The people behind javascript provided the object notation technology that has been absorbed by Postgres. It is not only being used to exchange data (as a standard text field would be used), but can be used to QUERY data. The database is aware not only of the JSON text string representation, but of the object itself.

I’m not being edgy you took a data format and pretended it was a refutation when it wasn’t and isn’t.

What does the JS in JSON stand for? Where did it come from?

What does that have to do with JSON being an intermediary language and not the actual persistence part? What part of that don’t you understand and I’ll try to explain it for you


I understand perfectly the difference between a system and a format. You either don’t or you’re trolling (and dishing out ad hominems to boot.). Reddit would be more suitable to your posting style, we come here with a higher expectation of them being actual discourse and intellectual honesty.

Surprised they didn't cover transaction id wraparound, which has caused more than a few public outages at scale.

IIRC some Postgres folks have a plan to fix it? Maybe Craig knows?

> Surprised they didn't cover transaction id wraparound, which has caused more than a few public outages at scale.

There have been some significant improvements in the last couple years - most importantly the introduction of the freeze map in 9.6. But also quite a sprinkling of other incremental improvements.

It can obviously still be a problem, but it's not as pronounced as it once was.

can you elaborate?

> can you elaborate?

On what exactly?

The freeze map bit I referenced above is the following commit: https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit...

    Commit a892234f830e832110f63fc0a2afce2fb21d1584 gave us enough
    infrastructure to avoid vacuuming pages where every tuple on the
    page is already frozen.  So, replace the notion of a scan_all or
    whole-table vacuum with the less onerous notion of an "aggressive"
    vacuum, which will pages that are all-visible, but still skip those
    that are all-frozen.
    This should greatly reduce the cost of anti-wraparound vacuuming
    on large clusters where the majority of data is never touched
    between one cycle and the next, because we'll no longer have to
    read all of those pages only to find out that we don't need to
    do anything with them.

This means that an anti-wraparound vacuum (automatically started, even when autovacuum is disabled) is cheaper than it used to be. Still not necessarily cheap, as indexes still need to be scanned (but see below).

Some of the additional changes were (reverse chronological order):

* 2020-03-28 - "Trigger autovacuum based on number of INSERTs" - https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit...

This is important because it will reduce the cost of an eventual index wraparound, as there will be less work in a later anti-wraparound vacuum

* 2020-01-20 - "Allow vacuum command to process indexes in parallel." - https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit...

As index processing can be expensive (no equivalent to scanning only changed parts of table), processing them in parallel can greatly reduce the time for a vacuum. Note that this isn't yet done by autovacuum.

* 2019-04-04 - "Allow VACUUM to be run with index cleanup disabled." - https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit...

This can be extremely useful for manual vacuum when getting close to a wraparound, since the index processing step is not necessary to stave of wraparound.

* 2018-04-04 - "Skip full index scan during cleanup of B-tree indexes when possible" - https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit...

This can make vacuums for pretty clean tables vastly cheaper.

There's also a number of changes that make indexes smaller / less likely to bloat. That in turn makes vacuuming more efficient. E.g.

* https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit...

* https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit...

Edit: formatting (gah, why is HN formatting so limited)

Just chiming in to say, Andres above can answer this far more in detail than I can. I suspect most people that do hit this today 1. have not nearly as a bad pain from it and 2. recover in generally much more quickly than previously. Today vs. say Postgres 9.6 is a huge difference.

awesome, that's an above and beyond answer! i really appreciate the detail!

Features I miss in PostgreSQL:

- Support large number of connections (without external pooling tools) and stop starting a new process per connection (consumes huge amount of resources). Other DBMS works with threads instead of processes and don't have any issues.

- Easy to configure high availability with multiple nodes

Less important:

- Graph capabilities directly in PostgreSQL without extentions or something else

- Running a scheduling tasks directly in PostgreSQL without external tools (for example a script that removes old data from database).

- Supporting Temporal Tables

But overall I like PostgreSQL more than Oracle or SQL Server, even if I miss the above features.

> It was really funny, Uber switched from MySQL to Postgres, and then Postgres to MySQL. So they’ve gone back and forth a couple of times, but when they’ve swapped to Postgres, they had to figure out – they had all this app logic that relied on things being case-insensitive, because the database just doesn’t respect case search.

This suggests to me that the guest here maybe doesn't know about collations? But that doesn't seem likely (possible?) given his stated experience and the roles he's held. So what gives?

Guest here. The context is Uber was on MySQL which by default doesn't respect (or at least in their version and setup at the time) case sensitivity. To MySQL craig is the same as Craig when matching.

Their app for search relied on this. The solution for them was the citext data type to create a case insensitive string - https://www.postgresql.org/docs/13/citext.html

I'm fairly familiar with Postgres, but have not heard of a collation that tackles case insensitivity without tweaking the system itself? Postgres you can tweak this at the system level, but I've not seen it in the wild and as per the docs you're now non deterministic. For MySQL it appears you can change this, but seems a surprising default for them they expected case insensitivity.

Yes, another option could have been indexing and searching on lower or upper casing, but they wanted the minimal change to their app.

There was once a conference talk they gave about their migration process from MySQL->Postgres, but I'm not immediately able to find the video online so it may have been removed.

I'm not familiar with pg, but the choice of case sensitivity is a fundamental property of collations in other DBMSes like MSSQL. Web search does make it look like pg only got case-insenstivity-via-collation in v12 (2019), and also requires you to make your own collation with case-insensitivity since it doesn't have any built-in ones with it.

It's been a while and don't remember the specifics, but when we implemented a POSIX-compliant filesystem at Maginatics (acq. by EMC), we used MySQL (Percona actually) to store the FS logic (inode numbers, filenames, etc) and case sensitivity was one of the FS features. Maybe Percona supports it? I do remember (at least at that time) that full utf-8 support was poor though.

Btw, don't get me started about case-sensitive file systems - Windows got it right (case-preserving, but case-insensitive). Why should both /home/ellen/Music and /home/ellen/music exist? :)

Because once you go there, you have to answer questions which are not as easy. Does M match m? Does i match ı? Does a match あ? Does あ match ア? Does m match 𝓂? What happens when they need to coexist?

But they're different! I see where you're coming from though (unfortunately I know all these examples too well..).

Case insensitive search is accomplished with ILIKE no?

How well does ILIKE work with indexes? In MySQL the collation (which includes case- and accent-independency options) is known to the index.

You can stipulate the collation in the query if you need something non-default.

Great practical note by Josh Berkus on why Uber left Posgresql. Basically: runaway table bloat because Uber had a usecase that postgres doesn't address as well as InnoDB.


The whole VACUUM paradigm is the biggest thing that bugs me about pgsql. The fact that it can actually freeze things always worries me. Can’t this happen constantly in the background like modern GCs?

I think you might be pleasantly surprised to get current on how vacuum works in current editions. Vacuum used to be a bigger issue for pg.

It does, though? That's what autovacuum is, unless I'm missing something.

It's just not always enough without tuning, just like GC.

We have had a lot of problems with frequently updated tables. Is auto tuning really that hard? Or is pg’s philosophy not to “auto” anything?

I learned that about postgres collations and case sensitivity about 6 years back, and it was within 3 months of my first heavy use of postgres. So I wonder if this person was just experiencing their first usage of postgres too? And I've used relational databases for more than 25 years.

I read that as Uber discovered their code relied on MySQL default case insentivity. Migrating to Postgres might have been tricky as collation is a very recent addition in Postgres.

That could be. It was the "because the database just doesn’t respect case search" part that struck me -- to say that MySQL doesn't respect case search is false, and the only scenario I can think of where it can even appear to be the case is if you're new to MySQL and you've made a big assumption without even googling it. Or maybe if you're new to string comparison (and therefore programming?) in general.

Uber once switched Postgres->MySQL because of human error and polytics for “taking remediation action”.

I like the this whole class of questions "What's so exciting/good about <SOMETHING>?"

Online (forums, etc) this type of question usually elicits informative answers, and probably in face to face communication too.

It works reliably and is not at the mercy of Oracle - so it's pretty much the winner for relational storage, unless you need something different (doc db, time series db...).

Wow, what a great episode! Probably one of the info-loaded podcasts I've listened to. Definitely going to be using psql's \e command with $EDITOR set to VS Code.

>[unintelligible 00:08:02.01] "is to go hack on"

>[unintelligible 00:09:07.07]

"lemming status of developers"

>[unintelligible 00:19:16.22]

"laying in"

>[unintelligible 00:27:29.18]


>[unintelligible 00:30:14.16]]


> Our transcripts are open source[0] on GitHub. Improvements are welcome.

Looks like you (or anyone else) can directly contribute these corrections

[0] https://github.com/thechangelog/transcripts/blob/master/podc...

> There’s a professor from the University of Moscow that – I’m not sure if he still teaches, or used to teach on astrophysics, and then hacks on Postgres for fun.

I'm pretty sure he's talking about Oleg Bartunov! He's one of the best speakers about Postgresql at Russian Highload conference: https://www.youtube.com/watch?v=uhvqly8MtoI&list=PLH-XmS0lSi...

Title has a typo: What's so exciting about Postgres

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact