I've seen so many PHP apps that store UTF-8 encoded text in Latin1 fields, and it kind of works, until you correctly configure your client library, or try sorting the data, or check for equality...
The problem is so common that the popular MySQL client app Sequel Pro even has an option to set the encoding to "UTF8 via Latin1", which tells the server to output Latin1 but then treats the result as UTF-8.
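For anyone who hasn't hit this yet, here is a rough sketch of what goes on, using nothing but Python's codecs. It assumes Python's "latin-1" codec is a fair stand-in for MySQL's latin1 charset (MySQL's latin1 is closer to cp1252, but the principle should be the same), and the sample string is made up.

```python
# Sketch of the "UTF-8 stored in Latin1" mess, no database involved.
# Assumption: Python's "latin-1" codec stands in for MySQL's latin1 charset.

original = "caf\u00e9"  # "café"

# The app hands UTF-8 bytes to a connection that believes they are Latin1.
utf8_bytes = original.encode("utf-8")

# A correctly configured client reads those bytes as Latin1: mojibake.
misread = utf8_bytes.decode("latin-1")

# "UTF8 via Latin1": take the Latin1 output and reinterpret it as UTF-8.
recovered = misread.encode("latin-1").decode("utf-8")

print(repr(misread))
print(repr(recovered))
print(len(original), len(misread))  # character counts differ, so lengths and sorting go wrong
```

The same reinterpretation trick is what the workaround option relies on, which is why things appear fine until something looks at the stored characters rather than the bytes.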
So once you've learned this the hard way, you'll change the db encoding to utf8, and you'll be glad to have finally fixed the encoding mess, only to stumble across the fact from the original article that utf8 is not really utf8 in MySQL (it stores at most three bytes per character). Oh, the horror!
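A quick way to see which strings are affected: anything outside the Basic Multilingual Plane needs four bytes in UTF-8, so it won't fit in MySQL's three-byte `utf8`. A minimal sketch (the function name `fits_in_utf8mb3` is made up for illustration):

```python
def fits_in_utf8mb3(text: str) -> bool:
    """Return True if every character needs at most 3 bytes in UTF-8.

    Code points up to U+FFFF (the Basic Multilingual Plane) encode to
    1-3 bytes; anything above needs 4 bytes.
    """
    return all(ord(ch) <= 0xFFFF for ch in text)


samples = ["na\u00efve", "\u65e5\u672c\u8a9e", "\U0001F600"]  # "naïve", "日本語", an emoji
for s in samples:
    print(repr(s), len(s.encode("utf-8")), fits_in_utf8mb3(s))
```

Running something like this over a sample of your data before a migration can give a rough idea of whether you were already losing characters.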
ALTER TABLE database_name.table_name CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_520_ci;
Verify that all of your columns are utf8mb4 with:
SELECT table_schema, table_name, character_set_name, collation_name
FROM information_schema.columns
WHERE table_schema NOT IN ('mysql','information_schema','sys','performance_schema')
AND character_set_name IS NOT NULL
AND character_set_name != 'utf8mb4';
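If that query comes back with a lot of rows, one option is to generate the ALTER statements rather than type them. This is only a sketch: `quote_identifier` and `build_alter_statements` are hypothetical helpers, the input is assumed to be (schema, table) pairs taken from the query result, and you should review the output before running it against anything you care about.

```python
def quote_identifier(name: str) -> str:
    """Quote a MySQL identifier with backticks, doubling any embedded backtick."""
    return "`" + name.replace("`", "``") + "`"


def build_alter_statements(rows, charset="utf8mb4",
                           collation="utf8mb4_unicode_520_ci"):
    """Return one ALTER TABLE statement per distinct (schema, table) pair.

    The verification query returns one row per column, so the same table
    can show up several times; duplicates are skipped here.
    """
    seen = set()
    statements = []
    for schema, table in rows:
        if (schema, table) in seen:
            continue
        seen.add((schema, table))
        statements.append(
            f"ALTER TABLE {quote_identifier(schema)}.{quote_identifier(table)} "
            f"CONVERT TO CHARACTER SET {charset} COLLATE {collation};"
        )
    return statements


# Made-up rows standing in for the query result.
rows = [("shop", "emails"), ("shop", "emails"), ("shop", "users")]
for stmt in build_alter_statements(rows):
    print(stmt)
```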
You may have to update your client libraries (like Perl-DBD) if you're still using something like CentOS 6.
If your data model layer supports it, it's often better to just use raw binary instead of any particular encoding. The main features utf8mb4 provides are validation and sorting. Most people are just using an ORM and use Unicode internally in an application, so validation of the encoding is nice, but not essential. And for most applications it doesn't make sense to sort arbitrary utf8 strings in the database layer. (Sorting Unicode strings in general is very complicated, and 99% of the time it's not something you really need.)
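To make the trade-off concrete, here is a small sketch of what you get with raw bytes: ordering and equality are by byte value, so the application has to do any normalization itself. The word list is made up, and `binary_sort` is just an illustrative helper, not anything MySQL provides.

```python
import unicodedata


def binary_sort(strings):
    """Sort the way a binary column would: by raw UTF-8 byte values."""
    return sorted(strings, key=lambda s: s.encode("utf-8"))


words = ["zebra", "Zebra", "apple", "\u00e9clair"]  # last one is "éclair"
print(binary_sort(words))  # uppercase before lowercase, accented letters last

# Equality is byte-for-byte too: the same visible "é" can be stored two ways.
composed = "\u00e9"      # single code point
decomposed = "e\u0301"   # "e" followed by a combining acute accent
print(composed.encode("utf-8") == decomposed.encode("utf-8"))

# Normalizing in the application before storing makes the bytes agree.
normalized = unicodedata.normalize("NFC", decomposed)
print(normalized.encode("utf-8") == composed.encode("utf-8"))
```

If byte order and byte equality are good enough for your lookups, binary storage is probably fine; if users expect "apple" and "éclair" to sort near each other, you are back to needing a collation somewhere.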
Because of the way MySQL rolled out `utf8` (a strict subset of UTF-8) and then `utf8mb4` (which is a full UTF-8 implementation), the other top result is similarly poisoned: the directions describe using `utf8` and have an addendum describing `utf8mb4` (which is easy to miss).
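-- SELECT list and FROM clause are assumed; only the WHERE clause is from the original.
SELECT t1.TABLE_NAME, t1.COLUMN_NAME, t1.COLLATION_NAME
FROM information_schema.COLUMNS AS t1
WHERE t1.TABLE_SCHEMA LIKE 'omtest'
AND t1.COLLATION_NAME IS NOT NULL
AND t1.COLLATION_NAME NOT IN ('utf8mb4_unicode_ci')
AND t1.TABLE_NAME = 'emails';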