"The old syntax was meant to be only deprecated, but it was accidentally completely removed."
How could Oracle have possibly "accidentally completely removed" the old syntax for setting a password even though it "was meant to be only deprecated", without being completely incompetent?
I can not in my wildest imagination come up with a scenario in which a competent developer could possibly "accidentally remove" something like that. If they made such a huge "accident" with that security feature, what other terrible negligent "accidents" lurk just beneath the surface of this new release of MySQL?
How can you "LOL" away such an incompetent "accident" like that?
"LOL away?" What gave you the impression that I was making fun of it? I am the one who found the bug and reported it, and this is the explanation I got. Since the person I talked to is someone I trust, I don't have reason to think otherwise.
I can elaborate further and tell you why this particular error was likely to happen: it's because since the beginning MySQL was built with root user without password, and most of the tests were running in such conditions. That was the culture of rapid growth that gave MySQL its popularity and the developers are paying now the technical debt that has left so many features with poor testing behind. What Oracle is doing now is trying to redress that situation by strengthening the defaults, for which I commend them. Being a QA developer, I am less pleased with the quality of tests that let such mistake pass, but I know that such things happen, and it does not necessarily mean that there is a can of worms behind this bug.
It's not my impression that you're making fun of it, but that you're trying to excuse it as an inconsequential "accident" instead of incompetence. Accidents like that don't happen without causes like incompetence or negligence or malice. Accidents like that are a symptom that something is deeply wrong.
To describe our process:
We have at least 2 code reviews, ~90% test coverage before we release a feature in a development milestone. It was then caught by at least 3 people that I know of prior to public release. Since this is not a GA release, it was decided to fix the bug in the next version.
So much MySQL hate these days... I should try Postgres out but here's why I still like and trust MySQL or MariaDB/Percona...
1. It's reliable/durable
2. I know the tools (mysql command line is friendly to me)
3. It's performance is predictable. 99% of issues with performance are easily solved with the right index
4. I don't find myself needing many SQL features beyond the basics.
Also, I'm a happy user of other tools to compliant mysql. Including redis and elasticsearch. Both of these tools IMO compliant MySQL.
I'm no fan of Oracle by any means but at least MySQL is still an open source database. Perhaps when I have free time before the next startup I'll give Postgres a good look. For now I'll probably try it out with AWS redshift...
I would personally contest that. I've seen MySQL corrupt tables before and the insanely unhelpful defaults which make MySQL accept and alter invalid data really don't strike me as neither reliable nor durable.
Your other points are valid reasons for sticking with MySQL though. If you're happy and the tool does what you need, that's fine.
But once you've seen what you can do as you gain access to the more advanced SQL features and once you've burned by MySQL not starting up because of data corruption, then you might want to investigate other options.
Postgres happens to be a very good alternative with good tools, predictable performance, very advanced SQL features, but also really good reliability and durability.
Corruption issues are fairly true with MyISAM engine. InnoDB did a lot to help reliability of tables (it can auto-correct them in cases where you had a hard poweroff or other related issues). No more needing to manually run table scans to detect corruption.
What "advanced SQL features" do other DB products have that you cannot do with MySQL?
CTEs, range types, window functions, custom aggregates, functional indexes, conditional indexes, proper utf-8 support and once you bring Postgres with its unique features into the mix: kick-ass JSON support (including the usage of indexes), array support (any my best friend array_agg())
Regarding corruption, it's an anecdote, but my girlfriend has lost a ton of data for her thesis related to innodb corruption. It took me hours of bit-twiddling to get the data back (and into Postgres where it's safe and accessible since then - on the same hardware, so don't blame this to broken hardware)
Now I can totally accept that this might have been a one-time fluke, but I personally really have trouble trusting a DBMS once it has lost data due to causes other than user-error or hardware faults.
It's not pretty, but you can get the equivalent of range types using variables[1].
Regarding corruption, it's an anecdote, but my girlfriend has lost a ton of data for her thesis related to innodb corruption. It took me hours of bit-twiddling to get the data back (and into Postgres where it's safe and accessible since then - on the same hardware, so don't blame this to broken hardware)
It's not safe until it's backed up. I don't care if it's a $100k oracle installation, It's not safe until it's backed up. Even then it's suspect...
Coming from a MySQL background I didn't understand the fuss over CTEs ...until I worked with a SQL Server guy. That completely opened my eyes to how _sane_ things could be, vs my MySQL work-arounds.
As a long-time and fairly technical MySQL user and only light experience with PostgreSQL, can you suggest the best reading material (in addition to the user manual) on the subject?
Comparatively, for MySQL, my recommendation would be 'High Performance MySQL'.
Very minor thing, but funny enough, I was working with MySQL just earlier and need to use my one in a blue moon quota of Full Join ... turned out MySQL doesn't have it :( (yes, I know how to emulate it with union).
So long as it possible to redefine the mode at will on a per connection basis or on startup then the integrity of data held within MySQL cannot be guaranteed.
Being able to change it on a per connection basis is required for earlier applications. If we use Wordpress as an example, upon initial connection it will set the SQL mode to what it is compatible with (magento, drupal, others also do similar).
Integrity can also not be upheld if applications lie. i.e. I always enter the same incorrect birthday if an application asks me for this, but I don't think they have a reason for it.
So, MySQL, as of 5.7, no longer accepts dates with year 0, and it also stopped converting unrecognized date strings into the date 00-00-0000? That would be a very positive change.
It has "not supported" them for a long time. But it wasn't the default, due to backward compatibility (of course the decision that lead to this in the first place is no less suspect).
Though "you shouldn't have to", my first recommendation for anyone setting up a MySQL installation is to go to the Percona Configuration Wizard and build a much more appropriate configuration file (with strict and enforcing defaults, better performance and a whole bunch of other things).
> 3. It's performance is predictable. 99% of issues
> with performance are easily solved with the right index
This is one of my least favorite things about MySQL. The "explain" feature is much less informative than that offered by MSSQL or Postgres, so figuring out what indexes or tuning to apply is harder than it needs to be.
And then when you do have everything indexed correctly, MySQL is significantly dumber (edit: looks like I'm significantly dumber! see correction below) about using those indexes than MSSQL or Postgres.
For example: those two can do index-only SELECTS. If table "foo" has columns A, B, C, D, E, and F and I've created an index on A and B, why can't "select A,B from foo" be served directly from the index? Postgres and MSSQL do it, and it's a huge performance boost in those two.
Postgres also lets you do some amazing things with partial indexes, indexes on computed values, etc.
MySQL might not be horrible here, but out of the 3 major relational databases I've used, it's clearly the weakest at this so it's funny to hear it touted as a strength.
Edit: Good news - I was wrong; MySQL does support index-only selects (covering indexes). User morgo also pointed out the improved JSON format for explain as well.
If you do not like explain, may I suggest using `EXPLAIN FORMAT=JSON` - it is new to MySQL 5.6, and contains additional meta data. There is also optimizer trace, which provides significantly more meta data. Some good examples in these slides: http://www.slideshare.net/oysteing/how-to-analyze-and-tune-s...
That's great to see. Thank you for the correction. I did not know that information was available. I wonder if there are others like me who didn't know this because they've been doing the plain-vanilla EXPLAIN like always.
Under your point (3), it is important to note that this has been a major focus area of MySQL. I have two comments to make on it:
1) Peak throughput is not as important as consistent throughput (i.e. less variance between response times). This makes it a great fit for front-facing web apps, and we have Percona/Facebook/Google in particular to thank for their contributions in improving this.
2) Often users do not get the performance they are entitled to, because they lack the visibility. MySQL 5.6 and above (and particularly mysql 5.7) have performance_schema to be able to track down and diagnose issues. Memory/transactions/stages/replication/prepared statements/statements/mutexes.. these things are all instrumented by Performance schema. Just today I saw a new query written to show a breakdown of latency per schema/per statement type: http://dba.stackexchange.com/questions/94723/getting-databas...
>4. I don't find myself needing many SQL features beyond the basics.
That's probably why you still like it. As it stands, MySQL has so many shortcomings it's not even funny.
Version 5.7 is the first one to support multiple triggers per event. Beyond that, and just to name a few, you can't have subselects in views, you don't have CTEs (good luck storing hierarchical data in an adjacency list), no materialized views, CHECK constraints are parsed but ignored.
I'm glad it works for you, but the minute you need to do something slightly more complex, MySQL gets in your way.
There have been several times in my career where I have missed full SQL support in MySQL databases other people have setup and left for me to maintain. Choosing MySQL isn't the worst thing one can do when building a new system, but just because the initial designers don't intend to use advanced SQL features doesn't mean future use cases wont benefit from full SQL support.
Indeed. Just these days the lack of materialized view has been giving me troubles. It would make a certain process about 50 times faster without touching code at all, and it would take me all of three queries.
Granted, the real problem is a poor design of the system, but materialized views would enable me to improve what others have left behind.
Try this. Download mysql sources. Install clang 3.6. Try to compile with all sanitizers. Last time I did it (it was percona code) it found a leak somewhere in SQL parsing code.
No kidding.
It is my habit from old days to compile everything.
MySQL is a fine DB for most cases. There are some specific features it lacks that do make it annoying (and make Postgres a better choice):
1. Indexes are too simplistic. You can't index the output of a function.
2. InnoDB lacks full text search. MyISAM has it, but MyISAM is bad since it doesn't support transactions.
3. Unicode support by default does not support all Unicode characters. That's right, even though it says UTF-8, not everything is supported, and you have to specifically enable support for some character subsets for your DB.
4. MySQL replication is... how should I put it? Delicate. There is no integrity checking, and since by default it's just replaying an SQL log you can easily get inconsistencies between the master and the slave. There are lots of ways to confuse replication, such as `INSERT INTO my_table (foo) VALUES (RAND())`. There are no built-in tools for integrity checking the slave, and the third party tools that exist have to resort to some really crazy things, like re-inserting tables. I could go on about replication, and its issues for a while, but I want to move on to the other points.
5. No transactional DDL. This really sucks when using with something like Django migrations.
6. No point in time backups. You either use mysqldump with a transaction (you aren't using MyISAM, right?), or you stop the database server.
7. Logs suck. No, really, have you tried debugging an issue with your queries, deadlocks, configuration errors, etc.? MySQL's server logs (not query logs), are not very verbose, and what the do write is mostly useless.
8. Row level locking semantics are at times doing unexpected things. Last time I used MySQL for complex real time write-heavy stuff, I actually moved to advisory locks instead (the support for which is not exactly great in MySQL and could be expanded). This invites contention, deadlocks, and other nastiness where it isn't necessary.
9. No IPv6 support.
10. Can't index Archive engine tables.
11. Can't partition tables by any arbitrary value. It has to be only specific types which makes it too restrictive.
12. GIS support is not really there. Lots of basic features aren't supported.
13. No materialized views. You can simulate it with triggers, but that's not nearly as convenient.
14. No async drivers.
Note that these are very specific features. If you don't use them, good for you. No reason to switch away from MySQL just because. It has some advantages over Postgres as well:
1. Simple user management.
2. Simple database management. No schemas, objects, etc., just tables and views here.
3. INSERT IGNORE and REPLACE! You don't need to create triggers for these.
4. Decent performance out of the box, and lots of knobs to turn when tuning.
5. Generally, doesn't require a ton of setup time. Install it, create user and DB and code away (Postgres and lots of others have this too, but it is a good feature).
6. Large amount of community brain share, so you won't be braving a new world here.
So, basically, go ahead and use it if it works, but I'd say Postgres deserves a try too and your 4 points apply to it equally as well.
3. I agree utf8mb4 should be renamed "utf8". When this was introduced it was given a new name to support downgrades, but the feature does exist :)
4. RAND() is actually deterministic (via other meta data written to the binlog). In any case, MySQL also supports Row-based replication and it is the proposed default for 5.7.
5. I would like to have this feature too :) It is important to note that few databases have this though.
6. MyISAM is no longer the default. PITR works fine with InnoDB.
7. The logging verbosity can be changed. In MySQL 5.7 there is a new server logging API which makes sure the output format is consistent. For some of the meta data you are suggesting, the better place to retrieve this is performance_schema.
8. You are probably talking about gap locks etc. These are required for statement-based replication, with Row-based they are not required.
10. I would suggest that ARCHIVE engine is of limited use-cases.
11. Please see http://mysqlserverteam.com/the-mysql-5-7-6-milestone-release... (search "innodb native partitioning"). Allowing more arbitrary functions would most likely require global secondary indexes. 5.7.6 contains an important change to move in that direction.
12. GIS support is improved dramatically in 5.7, including InnoDB support. Please take another look :)
14. I agree on this one. Async drives is becoming very important :)
Thanks. I'll add one more and maybe this is already in the works: last I checked you couldn't rebuilt an index of an InnoDB table without re-inserting all the rows into a table. This meant that adding an index to a huge table was very slow.
I didn't know about a lot of these improvements, and am happy that they are happening. I moved away from MySQL around 2012-2013 because of the above reasons, so glad that things are improving.
Redshift is great for some use cases, and completely unsuitable for others. If testing the waters with Postgres is your objective, Redshift will not be a good representation.
I have a similar opinion. Not to mention I have a many clients/projects already built on MySql databases. Moving to a different DB is expensive and unnecessary.
Well it's HN, outside of Postgres / Redis everything sucks. And yes you're right on MySQL it's reliable.
Honest question here, is there an equivalent of Percona tools for Postregres?
MySQL doesn't excite me at all anymore. It used to be my DB of choice but as of late I've been using postgres and other than tools being somewhat lacking (this is getting better all the time, there are now GUI browsers for postgres that rival sequel pro). Nothing was really keeping me on MySQL other than it was generally the prefered DB for most OS projects and I had used it in professional settings multiple times. Postgres's JSON store is very interesting and I've read a number of articles on other cool things in Postgres all while MySQL was more or less silent. I'd love to setup a postgres server at work and benchmark it against our MySQL DB, the one thing really holding me back is our use of ENUMs but I don't think that's going to be a big barrier and I hate them anyways.
Very interesting! Thanks for the link, even with that I'd rather convert the ENUMs to just VARCHARs as our MVC framework doesn't support ENUMs but it "Just works" because it sends strings to MySQL and MySQL stores them as INTs that map to ENUM values. Because of this the test framework never played nice with creating test DB's that used ENUMs.
I know what you're saying, but when I hear people being excited about data stores, I get a little worried. (Being excited about data stores gave us the rise of MongoDB. MySQL tables corrupting because you used MyISAM like a well-intentioned fool is "exciting", too.) I've moved to Postgres for most things because it's not at all exciting for the majority of use cases. It does a thing, does it well, and doesn't break. I honestly can't say the same about MySQL.
Yes, "Excite" is probably not the right word to use there. I share a wariness towards too much "excitement" about a new DB (I never really trusted Mongo and only used it to play with). In this case I don't see postgres as "new and unproven" but rather "proven and pushing the envelope" which is what excites me.
If marketing is to be believed, it does seem like MariaDB offers many points of attraction over MySQL. The fact that Oracle does not provide test cases or has proprietary extensions is a bit alarming from a software freedom perspective.
Its clickbait headline, or maybe they're going for outright sarcasm given how little is changing.
Technically its true that the install process is a little different but not much, and theres some minor changes in user configuration. I've looked into it and the most exciting "DBA level" changes are YEAR(2) is finally utterly expunged, and there's some new reserved words "NONBLOCKING" and "OPTIMIZERCOSTS". Thats about as exciting as it gets.
Its not transparent upgrade but its not like they're changing the default db engine to an embedded nosql engine, or changing the default char encoding to EBCDIC.
I would say the biggest change is that the default SQL mode will be the equivalent to 5.6's ONLY_FULL_GROUP_BY,STRICT_TRANS_TABLES,NO_ENGINE_SUBSTITUTION,ERROR_FOR_DIVISION_BY_ZERO,
NO_ZERO_DATE,NO_ZERO_IN_DATE,NO_AUTO_CREATE_USER.
This means that a number of statements that produced warnings (but proceeded) will now produce errors.
> The main effort of the team has been focused on speed, with performance reportedly improved from 2 to 3 times compared to previous releases.
Definitely good news, as MySQL came out during the peak of Moore's-law (or, more appropriately, a misapplication of) fueled idiocy where nobody seemed interested in investing in better performing software, since they all expected they'd have faster hardware next year. Now everybody's deploying to VPSes, so it's once again important that we get the most out of every single CPU cycle. I haven't gone through all the diffs, but if they're promising 2x to 3x improved performance, there must have been some pretty fundamental rewrites of old code.
Coming from the Oracle + DB2 world, what I'd really like to see is some improved performance analysis and query optimization tools. There have been some incremental improvements to EXPLAIN (the new 'EXPLAIN on a named connection' feature seems pretty cool for when I have 10+ application instances all connecting to my DB), but it's still got a long way to go.
Visibility is the most useful and always under-rated feature. If I can clarify some of the recent enhancements here:
- MySQL 5.6 introduced `EXPLAIN FORMAT=JSON`. MySQL Workbench uses this to visualize query plans (important as they get complicated.)
- Also in 5.6 was optimizer trace (find out why indexes were not used etc.) and performance_schema enabled by default.
- MySQL 5.7 adds cost information to `EXPLAIN FORMAT=JSON` (Workbench also understands it.)
- Workbench also has a set of performance views based on Performance_schema (based on a project called SYS). You can install it as standalone from here too: https://github.com/MarkLeith/mysql-sys
- The runtime performance data from SYS is very useful for making correct optimization decisions. Being an SQL interface is also friendly for writing your own scripts against it.
Also, to clarify on your first paragraph:
Yes, refactoring old code was one of the ways this was achieved. We picked refactoring targets based on performance and because some of the older code is not very testable.
A better overview link for what's new in 5.7.6 specifically is this one: http://mysqlserverteam.com/the-mysql-5-7-6-milestone-release...
To understand what's new in 5.7 overall: http://dev.mysql.com/doc/refman/5.7/en/mysql-nutshell.html
Specific to this blog post. Here is the page on upgrading: http://dev.mysql.com/doc/refman/5.7/en/upgrading-from-previo...
I'm happy to answer any questions, please feel free to ask away.