Hacker News new | past | comments | ask | show | jobs | submit login
MySQL now shows its thread names at OS level for better troubleshooting (tanelpoder.com)
131 points by tanelpoder 8 months ago | hide | past | favorite | 51 comments

The improvement I would like to help my MySQL troubleshooting is that bad queries give better and more specific error messages than “You have an error somewhere”.

That’s not to downplay the complexity of providing that, I’m sure it’s technically difficult, but this is basically the worst debugging experience I’ve had of any language.


"you've got a syntax error at around here: [the entire query]. go consult the manual or something idk"

I feel like this is MySQL's model very well. A long time ago they traded some strictness for speed, because the average MySQL user was doing some basic CRUD apps and it never mattered. But adding useful checks and good messages everywhere requires a fair amount of engineer effort and probably costs more processing than whatever performance benchmark they are probably optimizing for.

It shouldn't be tough to tell you "a comma is missing in this line". currently you get another absurd not only vague, but incorrect error codes.

You say that but having worked on a DSL for a while, you end up sprinkling this code all over the place and things which might be simple loops, or string.splits, etc become much more gnarly.

I like Datagrip for writing SQL for this purpose.

I've used to write my queries in pgAdmin or VSCode.

I downloaded DataGrip after reading your comment, and damn, the autocomplete is just so good. It even shows me the arguments of Postgresql extensions like TimescaleDB.

This almost reads like an ad, but I just wanted to thank you for changing my life :P

Is this better in MariaDB? Over the last hour I have been writing some Python code talking to MariaDB, and the errors I got about syntax issues all seemed to point fairly closely to what the issue was. Mind you, these are fairy simple queries (perhaps more complex queries produce worse errors?)

Honest question.

Been migrating a 5 year old Percona MySQL galera cluster over to MariaDB galera. Galera multi-master clusters, for my use case, are a huge win. A month ago I had started down the path of replacing it with CockroachDB, but Cockroach only supports UTF-8 and my use case requires LATIN-1, sadly.

I wasn’t aware of Galera, that’s really interesting. how is the transition from percona to mariadb? any major pain points?

Percona -> MariaDB has been super smooth. With Percona over the years I've had a few updates that didn't apply and I had to go in and fix things like repo locations or keys as things evolved, but mostly it was just a workhorse. I guess one issue we had was it would never autostart on a reboot (despite "systemctl enable"), but I was never able to track it down in a reproducible way.

With MariaDB I'm using the digitalocean repos and I guess we'll see how stable those are.

Galera has been great, have used it for 5+ years and only a couple hiccups. This is a rather small database, but nearly mission critical. A couple times the cluster failed to start and I had to figure out the most recent node and manually bootstrap from it. I built some small shell tools to tell me if the cluster was happy, and bootstrap, because I had to do it so in frequently I could never remember the queries and files involved, and didn't want to have to read through my docs when the cluster was down.

Our use case is a small e-mail relay, so the win with galera is each e-mail server also runs a database node, and since they are master/master we can take one node down at a time for maintenance without having to do any failovers or the like.

thanks! good to know

It's no better in MariaDB.

This took some getting used to after six years of working with Postgres.

In my current set-up (inherited), the unit tests run using H2 while running code talks to MySQL, so I have two databases that give unhelpful generic error messages.

For that, we can sell you something called Oracle!

...in which you get even worse messages, but for more money!

wow, is this true?

In my experience (a 25+ year Oracle database geek), MySQL error messages confuse me more and Oracle's are better. But Oracle also has some error messages where you have to know how parsers (and semantic/typecheckers) work, in order to understand why the message was worded like that.

But Oracle usually does point out the exact character/token in the SQL statement where the error originated from.

it helps when you surround each variable/name with backticks, then you get better error messages. But I agree MySQL should do that by default

Setting thread names is quite helpful for debugging. One of the projects I work on hoists Linux tasks into native threads, and we use the task comm and PID as the name, which is super helpful for debugging. That being said, each platform has their own weird quirks when it comes to names: Linux only accepts names up to 15 characters, for example. And macOS only lets you set the thread name from your own thread, so even though it uses the same pthread_setname_np call the parameters it takes are different :(

By the way, JVMs also set thread names for their own housekeeping threads (on Linux at least, not sure from which version) and allow applications to name them too. I wrote an example here:


“_np” stands for “non-posix” and is used for vendor-specific extensions; non-portability is expected.

Nice, this will be really helpful because it will end up in system monitoring logs/graphs. I've had to hunt down issues before and when you can see what it is that's eating a bunch of CPU it can be a very helpful hint.

Thanks MySQL team!

What's MySQL like as a database these days? I get the impression that it used to be a bit of a mess but it's improved a lot and has some advantages over Postgres.

Did anyone choose it over pg, and if so, why?

I'm a long-time (~1995) Postgres fan here, and my opinion is that MySQL today doesn't have all the weird restrictions and limitations that it used to have in the past. I don't get super deep into advanced features, but when I've used it lately it has had all the features I wanted. Though I was looking for a JSON data type earlier today and that didn't seem to be in there.

I did pick MySQL over Postgres for a old project that I'm currently working on again, the primary reason was easy multi-master replication via galera. I know that Postgres has long had many cluster stories, but I always struggled to even figure out which one was the right choice to make. Most of the clusters I've deployed ended up using DRBD and heartbeat rather than one of the Postgres native options.

As far as advance features, quite a few years ago I was doing this exercise to generate reports on some sample payroll dataset. The expectation was clearly that you would run a query, massage that data in some code, then run a bunch of other queries in that code to further correlate the data. I forget the exact problem that was posed.

I'm not super strong in database advanced features, but in the end I was able to come up with some fairly simple SQL, using materialized views or CTEs or similar, that produced the report entirely using SQL, in a fairly short runtime.

This was my first real experience with using MySQL beyond the basic features it had in decades past, and I was quite impressed with it.

> Though I was looking for a JSON data type earlier today and that didn't seem to be in there.

Which JSON type are you missing? I store and query JSON extensively (probably too extensively) in MySQL and most things work great.

Thanks, I misunderstood the docs page for JSON, it says it is an alias for LONGTEXT, so I just used the latter, I didn't understand that it was doing more than just treating it as text. In my case I'm not expecting to do more with the JSON than treat it as a BLOB, so I didn't dig too far. I just changed my schema to JSON and I'm re-ingesting test data now.

Are you using MySQL or are you using MariaDB? They are distinct databases with different features.

MySQL has had a full-featured native JSON column type since MySQL 5.7 (Oct 2015). Here's the current manual page for MySQL 8: https://dev.mysql.com/doc/refman/8.0/en/json.html

MariaDB does not have a JSON column type, instead aliasing it to LONGTEXT. MariaDB takes an alternative approach of providing JSON manipulation functions which work with stringy types. https://mariadb.com/kb/en/json-data-type/

Note that in some Linux distributions, when you try to install "mysql" packages you instead get "mariadb", so you may need to query the server to see which one it really is!

This was in Maria, thanks for clearing that up. I thought I had mentioned MariaDB but I guess that was another response, sorry for that confusion.

>I did pick MySQL over Postgres for a old project that I'm currently working on again, the primary reason was easy multi-master replication via galera.

I plan to use CockroachDB as a distributed data store since it's protocol compatible with Postgres. Still in testing phase, but the deployment is very easy.

Agreed, the deployment was very easy, that was one of the reasons I wanted to switch to Cockroach from MySQL+Galera. It took me less than a day to get ansible roles to set up a Cockroach cluster, but around a week for the MySQL Galera cluster, to get it completely right. And that was starting with somebody else's Maria+Galera Ansible role.

My blocker was that I'm using it in part for Postfix maps, and the Postfix pgsql module can't be configured to us anything but LATIN-1 encoding, and Cockroach only supports UTF-8.

CitusDB shards PostgreSQL nicely for both data ingestion bandwidth and load balancing + HA

I haven’t used PG in production so I’m probably biased, naive or both, so I know this isn’t a direct response to your question, but…

MySQL has been great for every project I’ve used it in and haven’t had any issues. Largest app we used it in was Twitpic which handled high query traffic and along with the built-in replication, it was easy to use. We had replication lag on spinning disks (this was 2009) but switching to SSDs fixed it. Backups with xtrabackup from Percona is a key piece though for taking incremental or full DB backups.

I think back-in-the-day, PG didn’t have replication built-in and that always kept me using MySQL, but I’ve honestly never understood the bad rap MySQL got in years previous as I’ve never run into any of the described issues and it has been rock solid for me in apps of all sizes.

It's mostly fine and certainly has some advantages, but it also continues to have weird behavior that can randomly creep up. My recent favorite is[1]:

> MySQL parses but ignores “inline REFERENCES specifications” (as defined in the SQL standard) where the references are defined as part of the column specification.

Because of course it's very reasonable to accept a foreign key specification but completely ignore it.

[1]: https://dev.mysql.com/doc/refman/8.0/en/ansi-diff-foreign-ke...

We gave the geospatial features a try with a real estate dataset that contains ~50M rows. It's crappy to say the least. Half-baked features, poor performance, random crashes, barely documented gotchas with no good reason.

At least the MariaDB folks try to fix the problems that are passed down to them. MySQL just seems content to tick the box, "Yeah we have that feature too, please don't switch to PostgreSQL!" and leave it at that. MySQL is okay for typical CRUD workloads with old-fashioned scalar types, but don't expect much with respect to the newer, fancier features.

The list of companies using MySQL as their primary database is absolutely massive. It's a sizeable chunk of the world's economy and it's not just used for CRUD workloads. InnoDB is admittedly much better at OLTP than OLAP, but OLTP encompasses a lot more than simple CRUD workloads.

> At least the MariaDB folks try to fix the problems that are passed down to them

This assertion simply does not make sense. MariaDB hard-forked its codebase from MySQL's a full decade ago. They were able to use/port some open source MySQL code through ~5.7 but that seems to have stopped in recent years with MySQL 8, probably due to its major architectural changes.

These days MySQL and MariaDB each have some nice features that the other one lacks.

With respect to what is "passed down", by whom and to whom: the creator of MySQL is literally the chief architect at MariaDB, and many other early MySQL AB engineers work at MariaDB. Meanwhile most of MySQL's poor reputation today is a lingering effect of its early versions, largely before the Oracle acquisition of Sun. In this context, I am continually confused as to why some Postgres users have a higher opinion of MariaDB than MySQL.

Edit to add: I failed to mention that I do absolutely concede that for a new application with heavy geospatial component, Postgres+PostGIS is absolutely the correct choice. InnoDB's geospatial support is comparatively much newer and less widely used.

MySQL’s spatial indexes work just fine and yield extremely low-latency queries when used correctly.

Well, the last time we tried to insert a bunch of complex polygons (hundreds of points, multiple parts, etc.) into an indexed column using several concurrent transactions, MySQL crashed.

I think to think of it as an interface, https://dbdb.io/browse?compatible=mysql

These days you have shiny companies like PlanetScale and SingleStore implementing the MySQL wire protocol.

MySQL been rock solid for us, handling a million+ queries per hour on a fairly large database, running continually for over 10 years (other than occasional updates), but we're not using any advanced features, just run of the mill relational db with some full text searching. The tooling (phpmyadmin, percona xtrabackup) and knowledge available online are also a big plus.

I use MySQL (and increasingly, MariaDB) because I prefer the tooling. I use software like PHPMyAdmin and mycli, and when I need to use Postgres or MSSQL the tools available do not live up to the same high standards.

What does PHPMyAdmin offer over, say, Postico? I avoid MySQL because of its history of being "a bit of a mess" and its lack of tooling.

I’ve heard of it for a starter…

Tryin to find Postgres tooling like that hasn’t been very productive. Thanks for the tip just for this one.

I've used both professionally. I only have a mild preference for PG, MySQL has been perfectly serviceable for me.

I'm using MySQL 5.6 and 5.7 and the error messages for syntax errors in queries are next to useless.

In terms of popularity, it sure looks like that ...


It's terrifyingly bad and its global buffer pool mutex that's designed to look smart on benchmarks falls to pieces if you have a few dozen queries running at the same time. The only places I've had to suffer through it were ones that made the fatal decision early in the life of the company when they didn't have anyone on staff qualified to make that decision, but then were stuck with it forever.

This comment looks like an attempt at implying some kind of deep knowledge, but you're demonstrating a terrible ability to convey information. "Global buffer pool mutex that's designed to look smart" does not fall apart if you have a few dozen queries running at the same time. Source: the place I work at has a multiple of several thousand of what you wrote, and surprisingly - nothing falls apart, and we've been using MySQL for decades. Like any software, it has its negatives and positives.

Our job is to be aware of shortcomings and work around it. This rule applies to any software, any language. I'm sorry that using MySQL has made you so mad, but deliberate spreading false knowledge simply because you're angry helps no one.

I think you need to update your knowledge about MySQL and not bang on about v5.1 and earlier.

rearranging deck chairs on the titanic

How is this comment useful? It resembles what children comment about popular online games when they don't like them.

If you don't use MySQL, it doesn't mean you're aware of who does, how they use it and what's the revenue produced.

It'd be great if you could expand your post and add details, perhaps even some (anecdotal) evidence that sheds more light into your point of view.

Hmm? MySQL is perhaps the most popular open source database when looking at # users served. Uber, Facebook, Twitter, all use it as their underlying data store. I'd say it's _far_ from "titanic", even if Postgres is quite popular these days.

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact