Tag Archives: InnoDB

Tool of the Day: rsnapshot

rsnapshot is a filesystem snapshot utility for making backups of local and remote systems, based on rsync. Rather than just doing a complete copy every time, it uses hardlinks to create incrementals (which are from a local perspective a full backup also). You can specify how long to keep old backups, and all the other usual jazz. You’d generally have it connect over ssh. You’ll want/need to run it on a filesystem that supports hardlinks, so that precludes NTFS.

In the context of MySQL, you can’t just do a filesystem copy of your MySQL data/logs, that would be inconsistent and broken. (amazingly, I still see people insisting/arguing on this – but heck it’s your business/data to gamble with, right?)

Anyway, if you do a local mysqldump also, or for instance use XtraBackup to take a binary backup of your InnoDB tablespace/logs, then rsnapshot can be used to automate the transfer of those files to a different geographical location.

Two extra things you need to do:

  • Regularly test your backups. They can fail, and that can be fatal. For XtraBackup, run the prepare command and essentially start up a MySQL instance on it to make sure it’s all happy. Havint this already done also saves time if you need to restore.
  • For restore time, you need to include the time needed to transfer files back to the target server.

Good Practice / Bad Practice: CREATE TABLE and the Storage Engine

When you write your create table statements, always make sure that you make them non-ambiguous. That way even though other servers might have different configurations, you make sure your table will be created in the same way.
Imagine for instance you are developing an application on a development server, nicely storing all the scripts you need to create the same database on your production server. If the same script creates a table differently on both servers, that might cause you a lot of headache later on. At Open Query, we strive to minimise (or preferrably eliminate) headaches.

One of the parts of the create table statement that has the largest impact is the storage engine specification. When you omit the storage engine from the create table statement, your table is automatically created with the default storage engine type configured for the server. Since the storage engine is a very important choice when designing your tables, you want to make sure that it is always the correct type.

Here’s an example: instead of writing CREATE TABLE city (city_id int, city_name varchar(100)) you should write: CREATE TABLE city (city_id int, city_name varchar(100)) ENGINE=InnoDB

It is a simple adjustment, but it will save you from possible problems if you just make a habit out of specifying the storage engine.

But wait, there one more thing! It’s also very important you have sql_mode=NO_ENGINE_SUBSTITUTION in your my.cnf, otherwise your table may still silently become a MyISAM table if your desired engine (for any reason) is disabled in the binary, configuration, or at runtime. With this setting error, such a situation will cause an error – so you know for sure.

In our good practice / bad practice series, we will provide you with byte/bite sized pieces of advice on what we consider good (or bad) practice in MySQL-land. The topics can be just about anything, so expect random things to come up. Also, the level of advancedness greatly varies. A topic might be a no-brainer for some, a reminder for others and a revelation for a third person. We strive to tender to all of you!

How many files does InnoDB have open?

If you use innodb_file_per_table = 1 and innodb_open_files = X (whatever amount is suitable for your server) there’s no way internal to MySQL for finding out how many IBD files InnoDB actually has open. Neither SHOW GLOBAL STATUS LIKE ‘innodb%’ nor SHOW ENGINE INNODB STATUS provide this information.

Many sites do have a growing number of tables, so you’ll want to know when it’s time to up the number (and potentially also open-files-limit). Solution: sudo lsof | grep -c “\.ibd$”

What to do with the Falcon engine?

Keep it. Make sure it gets correctly positioned in the coming months.

It appears that with the Oracle acquisition, the reason-to-exist for Falcon is regarded as gone (a non-Oracle-owned InnoDB replacement), previously seen as a strategic imperative – much delayed though.

But look, each engine has unique architectural aspects and thus a niche where it does particularly well. Given that Falcon exists, I’d suggest to not just “ditch it” but have it live as one of the pluggables. What Oracle will do to it is unknown, but Sun/MySQL can make sure of this positioning by making sure in the coming months that Falcon works in 5.1 as a pluggable engine, perhaps also creating a separate bzr project/tree for it on Launchpad.

Then the good work can find its way into the real world, now.

InnoDB lock timeout before query execution

I found this yesterday while tracking down a locking issue for a client. They had a connection time out on a lock, but before it times out, SHOW PROCESSLIST had the status ‘statistics’ so it wasn’t actually executing the query yet. So, what was it doing and why did it time out there?

The answer actually was remarkably simple, but I did have to take a peek in the MySQL server source code (horay for Open Source!) The server sets the thd_proc_info to ‘statistics’ when calling the join optimiser, that’s the part of the optimiser that works out the best join order.

A lesser known feature of the join optimiser is that if it works out that only one row can match (lookup on primary or unique key field), it’ll try to retrieve the row. If there’s no match, there’s an “impossible WHERE clause” and essentially the entire query is optimised away and 0 rows returned.

If there is a match, all references to columns in that table can be replaced with the values that were just retrieved, turning them into constants. After all, there’s only one row matching! This can optimise away tables, removing the need to execute some or all joins. You don’t actually see this directly if you do EXPLAIN EXTENDED and then SHOW WARNINGS, because that output is from the parse tree which does not know about the join structure. But what you will see there is that the columns were replaced with constants, so from that you can deduce what’s going on.

For a query
EXPLAIN EXTENDED SELECT name from Country where code='AUS'
you would see access type const in the explain (which is the indicator for this optimisation), and SHOW WARNINGS brings up
select 'Australia' AS `name` from `world`.`country` where 1
so you see the WHERE clause is gone completely and the name column in the SELECT has been fed the value Australia so it’s all constants.

Back to the original issue… if another connection holds an exclusive (write) lock on that particular row, the join optimiser will be waiting for the lock to be released and that’s how you can actually get a lock timeout at this stage. It’s a perfectly normal thing to happen.