

20 tips for MySQL db architects - nocivus
http://www.scribd.com/doc/2565263/The-top-20-design-tips-for-MySQL-Enterprise-data-architects

======
axod
Obligatory complaint about how scribd takes 5 minutes to load and ends up as a
worse user experience than linking to the pdf.

Wait 5 minutes, then click on 'download'. _sigh_

~~~
ronaldbradford
axod, you can download these slides and other MySQL related presentations
directly from <http://ronaldbradford.com/presentations/>

~~~
axod
Very cool. Thanks I'll take a look there :)

------
mkn
The meta-point for me is that SQL is just broken, as are its many
implementations. We'd never tolerate a scripting language that made you learn
all the implementation details before you were effective with it. And yet we
tolerate this from database engines and the languages we use to interact with
them.

The opposite end of the spectrum is modern compilers. C/C++ compilers are
generally so good that most attempts at nuanced optimization are just
pointless; The compiler saw you coming and optimized your code behind your
back before you got there. Why can't storage engines be like this? Why do you
have to think not only about the type of your data, but how you want it
implemented as well? Are there technical reasons why a db can't take generic
data declarations ('string' instead of 'varchar(255)') and do the right thing
with it when you populate your db? Or treat the use of a transaction as a hint
that you'd like a storage engine that supports transactions?

I realize there comes a point where the db design has to be clamped down for
production, but in the design stages there's a lot of optimization that a 'SQL
compiler' could do behind your back. It currently seems like database product
designers have taken the lazy way out and decided that optimization should
live in the developers' heads rather than in the engine.

It just seems like SQL is this awful holdover from the days of COBOL and that
we seriously need a modern product that let's us think about our programming
problems rather than SQL's hangups. Am I missing something really basic, here?

Let me emphasize that, afaik, ronaldbradford knows his stuff, and I'm glad for
him if he can make a living off of his MySQL knowledge. I just wish that that
knowledge was embedded in the products themselves.

~~~
gaius
_I just wish that that knowledge was embedded in the products themselves_

Err, it is. Oracle, Sybase, MS SQL, Informix, all the major database products
have sophisticated query optimizers. You use SQL to describe the result set
you want, and it figures out the best way to get it.

Sadly the Web 2.0 world is full of people who actually believe MySQL is as
good as databases get, having never used anything else. It's probably a good
10-15 years behind the state-of-the-art, at least (and Oracle et al are behind
in niche areas).

~~~
mkn
I hear the points about query optimizers that you and the other repliers have
made. (And they are valid points, all. I was just misinformed about MySQL.) I
was more addressing the points made about database design decisions w/respect
to data type choice and their effect on storage requirements and speed. I was
wondering, specifically, if it is technically feasible to allow designers to
specify 'text' or 'string' at design time, enter in some typical data, and
have the engine choose the optimum type subject to its implementation
constraints.

That said, it seems like I need to educate myself on the various products,
though any prototypes I build will still use MySQL just because of the price
point. :o)

~~~
gaius
Check out PostgreSQL, Firebird and SAP/DB. I can't think of a single technical
or commercial reason to use MySQL.

~~~
newt0311
Correction. There is one reason. Its a lot easier to find MySQL devs. than
postgres, firebird, or SAP/DB devs even though the latter are a far superior
product technically.

------
tialys
This seems like it could be really handy if someone could perhaps narrate
along with it and help me understand what all of his talking points are about.

~~~
ronaldbradford
A video of the presentation can be found at
<http://technocation.org/node/533/play>

~~~
tialys
Excellent! Thank you.

------
ars
Small note about using unsigned ints: In PHP (and possibly others) they will
turn into signed ints when you read them.

If you read and write them and never show them, it's probably OK, but if you
have logic that checks for < 0, or the like, be aware of it.

But then again, if you have so many rows in a table that your auto_number has
reached over 2 billion, you probably have other issues as well.

------
Freaky

        ❖Average 25% - 30% saving on engagements
        ❖Better 60% (200GB System)
        ❖Best 78% (8GB per master with 12 masters)
    

What are they wibbling about?

~~~
ronaldbradford
As a database architect expert, I generally average a 25-30% reduction in the
Disk Footprint of the database, sometimes with zero impact for application
changes.

I use a very simple equation. DISK = MEMORY = PERFORMANCE Reducing the size of
the disk footprint, and therefore the amount of disk reads (disk is the
slowest of the 4 physical mediums) you improve performance. By then enabling
more data to be stored in the appropriately tuned MySQL Memory buffers, you
get a further increase in performance.

For more information on this point, you can contact me at
<http://ronaldbradford.com/contact>

------
tocomment
Why does he say varchar(255) is bad design?

~~~
petercooper
Because 255 is now just an arbitrary choice for a VARCHAR length.

Explanation: Prior to MySQL 5.0.3 (give or take a few point releases - I
forget) a VARCHAR column could be 255 characters in length maximum, so
VARCHAR(255) was often used as a default. Now, however, you can go up to
65,535 characters on VARCHAR, so if you're still using "255" then that seems
arbitrary and not well thought out (or your schema is just old).

~~~
ars
That's not totally true. Using a varchar of 255 or less saves one byte vs one
over 255. So if you don't know any particular size to use, 255 is reasonable.

And varchar(20) vs varchar(255) for a 10 character string use _exactly_ the
same space. So there is no reason to artificially use a smaller value.

~~~
petercooper
> So there is no reason to artificially use a smaller value.

That's not true either. As the document explains, MySQL's internal buffering
uses fixed sized columns matching the length limit of the VARCHAR column. So a
VARCHAR(255) will cause 255 bytes of memory to be allocated versus only 20 for
VARCHAR(20). Of course, it depends on what range of sizes your data fit as to
what you do :)

~~~
ronaldbradford
This is also multiplied by the character encoding, so when you have a table
with 10 columns at VARCHAR(255) because your too lazy to define them, and they
are all UTF8, so that's 255*3 for each column for each row when used in a
MySQL Internal buffer (not, these buffers may or may not be used depending on
the type of query).

When you have 100s to 1000s of queries per second, knowing and using your
memory wisely is very important, because MySQL unlike Oracle for example, does
not cap the amount of memory used for the Process, again using the Oracle
approach the PGA (Process Global Area).

It's always easy to take sort cuts in design, when your DB will never grow
above a few GB's and your users will never grow above say 10.

VARCHAR(255) is just as much a sign of little to no design, and therefore this
and many other beginner 101 mistakes are likely, and therefore large
performance gains are generally easily made in a system that now is needed to
perform.

~~~
gaius
Kinda the point of UTF encoding is it's variable-width; if you only need ASCII
then that's all you'll store. The MySQL people seem to have chosen to use the
worst possible case every time regardless; this is not a "feature".

------
ars
Watch out for using int's for storing IP addresses. It won't work with IPv6.

