
MySQL Text vs. Varchar Performance - mrdraper
http://nicj.net/2011/01/20/mysql-text-vs-varchar-performance
======
lysium
First, he mentions that VARCHARs are stored in-row, and then he wonders why
his server's performance goes down when he joins on a VARCHAR(30000)? And uses
EXPLAIN as the very last step? I don't quite get it, honestly.

~~~
nerfhammer
He doesn't seem to be aware that there's a difference between storage engines
when he says that "mysql" stores varchars in-row, when that's really MyISAM.
MyISAM is now considered obsolete and you shouldn't be using it anyway. Innodb
treats large varchars exactly like text/blob columns.

~~~
mildweed
Could somebody please point me to a longer yet still intelligent comparison of
MyISAM and InnoDB? Thx.

~~~
nerfhammer
Short version:

MyISAM doesn't support transactions and only supports table-level locking, so
a heavy concurrent read/write workload will bring it to its knees. Generally
you shouldn't use MyISAM anymore unless you know what you're doing and have a
specific reason for doing so.

InnoDB is much more crash-safe, supports transactions, and can support
concurrent read-writes using row-level locking. It's generally a much more
advanced storage engine than MyISAM and it's what Oracle is investing its
development efforts in going forward.

------
fleitz
This is why I like VARCHAR(MAX) in SQL Server if the row is < 8192 chars it's
a VARCHAR if it's > 8192 it's TEXT. You only pay the penalty on rows that
exceed that threshold. Best of all you can store 2GB of text in a VARCHAR(MAX)

<http://msdn.microsoft.com/en-us/library/ms176089.aspx>

~~~
singular
Using VARCHAR(MAX) still seems to have a significant performance impact,
however - [http://richardlees.blogspot.com/2010/07/varcharmax-
performan...](http://richardlees.blogspot.com/2010/07/varcharmax-performance-
in-sql-server.html) \- perhaps the fact it has to check whether the varchar
exceeds 8192?

~~~
dspillett
It doesn't just have to check the length of that one field - it needs to check
that the whole row wouldn't pass the per-row length limit even if the field
itself doesn't.

[N]VARCHAR(MAX) columns can lead to greater page tearing and other performance
issues over time if their value is changed between a size that would fit in-
row and a size that won't too - so as well as the hit on INSERT/UPDATE
performance there may be a (less significant) hit on read performance over
time too, on top of the slight hit from simply needing to check where the data
is and go find another page if it isn't in-row.

For the most part [N]VARCHAR(MAX) should be used as a more efficient (and
_far_ more convenient) replacement for [N]TEXT.

It should only be used as a replacement for [N]VARCHAR(<something>) when the
data in the column may need to break the ~4000 characters mark (for NVARCHAR,
the limit is ~8000 characters in the case of VARCHAR) or when the total bytes
for a row will.

------
MikeTaylor
Can anyone offer a TL;DR?

~~~
akronim
if things are slow, check EXPLAIN first.

~~~
robryan
Which strangely seemed to be one of the last things this guy did, mysql
explain and profile are usually enough to see what is going on.

~~~
barefoot
Good advice. "Explain" in this case wouldn't be enough to work out what's
going on, however. In both the TEXT and VARCHAR cases he expected the queries
to be using temporary tables so the output from mysql explain (both showing
the use of temporary tables but not the storage medium such as disk or memory)
would not have helped.

~~~
rapind
Plus I kind of enjoyed the round about way he explained it.

------
eck
I'm confused; in the example, couldn't he avoid the filesort entirely by just
adding an key on t2.t1id? Is it really optimal to be sorting the whole table
-- no matter how optimally -- on any common serving path?

------
eli
To be clear, it's not just "when a TEXT/BLOB is included in a sort" that
causes on-disk temp tables. One would expect to take a hit when sorting on a
TEXT field.

But actually it's much worse than that: you can get on-disk temp tables if
you're selecting a TEXT field while doing an ORDER BY or GROUP BY that uses
any columns from the second table in a join.

------
trustfundbaby
Learned a lot from that, very fascinating ... would have liked to see
benchmarks after he made his adjustsments and a comparison with an innodb
version of the database.

