Extremely important: If you use mysqldump, make sure it's also using utf8mb4. There is a very high chance it defaults to utf8 and silently corrupts your data.

Oh my god, this is both terrifying and could one day prove to be the most valuable comment I've seen on HN.

I love MySQL, but the fact that they didn't make utf8mb4 the default for mysqldump (instead of utf8), when they created utf8mb4 for database use... is one of the most irresponsible and negligent pieces of engineering I've seen in a long time, considering the amount of data worldwide backed up with mysqldump.

The fact that a program designed solely to back up databases would default to silent corruption like this is mind-boggling. At least this has been fixed for MySQL 8.0.

Imagine how many companies out there are relying on backups that are already corrupted...

Is your company one of them?

This is an important reason why testing your backups is a critical part of backing up.

It can be a very subtle issue, though. How would you recommend testing backups for rare corruption? Especially when live data is changing.

When I hit this, I had restored from a backup and nobody even noticed the problem for three weeks. Thankfully I had some more direct backups too...

This is oft-repeated advice, but, in this context, it ends up being little more than a platitude.

Since the complaint here is of corruption that is silent, the level of testing required to catch it would be extraordinary.
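
One cheap partial check, for what it's worth: in valid UTF-8, the bytes 0xF0 through 0xF4 only ever appear as the first byte of a 4-byte character, which is exactly the kind of character that MySQL's plain utf8 can't hold. So counting those bytes in a dump tells you roughly how many emoji-class characters survived. This is only a sketch. It assumes the dump is UTF-8 text (raw binary blobs would throw the count off), and `count_4byte_chars` is a name I made up.

```shell
#!/bin/sh
# Hypothetical helper: print how many 4-byte UTF-8 characters a file contains.
# Assumption: the file is valid UTF-8 text, so bytes 0xF0-0xF4 (octal 360-364)
# occur only as the lead byte of a 4-byte sequence.
count_4byte_chars() {
    # LC_ALL=C makes tr work on raw bytes; -cd deletes everything
    # except the listed byte range, so wc -c counts the lead bytes.
    LC_ALL=C tr -cd '\360-\364' < "$1" | wc -c | tr -d ' '
}
```

Run it as `count_4byte_chars dump.sql`. If a database you know contains emoji produces a dump where this prints 0, something dropped them along the way. It only catches this one kind of corruption, not corruption in general.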

Thanks, that would be:

mysqldump --default-character-set=utf8mb4 -u user database > dump.sql


Yeah, that should work fine. It's easy to test: just dump a row that contains an emoji.
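
If you want to script the emoji test, something like this might do. It's a rough sketch: it assumes you've already inserted a row containing U+1F600 (the grinning face) and written the dump to a file, and `dump_has_emoji` is just a name I picked. As far as I know, a dump made with the wrong character set typically shows `?` where the emoji should be, so the grep fails.

```shell
#!/bin/sh
# U+1F600 as raw UTF-8 bytes (F0 9F 98 80), written in octal so that
# printf behaves the same in any POSIX shell.
EMOJI=$(printf '\360\237\230\200')

# Hypothetical helper: succeeds only if the file contains the intact emoji.
dump_has_emoji() {
    # -q: no output, exit status only; -F: match a fixed string, not a regex.
    # LC_ALL=C makes grep compare raw bytes regardless of the locale.
    LC_ALL=C grep -qF "$EMOJI" "$1"
}
```

Then `dump_has_emoji dump.sql && echo "emoji intact" || echo "emoji missing"` tells you which case you're in.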